Evolutionary game theory and population dynamics 



Jacek Miękisz 
Institute of Applied Mathematics 
and Mechanics 
Warsaw University 
ul. Banacha 2 
02-097 Warsaw, Poland 
e-mail: miekisz@mimuw.edu.pl 

February 4, 2008 



1 Short overview 

We begin these lecture notes with a crash course in game theory. In particular, we introduce the fundamental 
notion of a Nash equilibrium. To address the problem of equilibrium selection in games with 
multiple equilibria, we review basic properties of the deterministic replicator dynamics and stochastic 
dynamics of finite populations. 

We show the almost global asymptotic stability of an efficient equilibrium in the replicator dynamics 
with a migration between subpopulations. We also show that the stability of a mixed equilibrium 
depends on the time delay introduced in replicator equations. For large time delays, a population 
oscillates around its equilibrium. 

We analyze the long-run behaviour of stochastic dynamics in well-mixed populations and in spatial 
games with local interactions. We review results concerning the effect of the number of players and 
the noise level on the stochastic stability of Nash equilibria. In particular, we present examples 
of games in which when the number of players increases or the noise level decreases, a population 
undergoes a transition between its equilibria. We discuss similarities and differences between systems 
of interacting players in spatial games maximizing their individual payoffs and particles in lattice-gas 
models minimizing their interaction energy. 

In short, there are two main themes of our lecture notes: the selection of efficient equilibria 
(providing the highest payoffs to all players) in population dynamics and the dependence of the 
long-run behaviour of a population on various parameters such as the time delay, the noise level, and the 
size of the population. 

2 Introduction 



Many socio-economic and biological processes can be modeled as systems of interacting individuals; see 
for example econophysics bulletin [16] and statistical mechanics and quantitative biology archives [13]. 
One may then try to derive their global behaviour from individual interactions between their basic 
entities such as animals in ecological and evolutionary models, genes in population genetics and people 
in social processes. Such an approach is fundamental in statistical physics, which deals with systems of 
interacting particles. One can therefore try to apply methods of statistical physics to investigate the 
population dynamics of interacting individuals. There are however profound differences between these 
two systems. Physical systems tend in time to states which are characterized by the minimum of 
some global quantity, the total energy or free energy of the system. Population dynamics lacks such a 
general principle. Agents in social models maximize their own payoffs; animals and genes maximize 
their individual Darwinian fitness. The long-run behavior of such populations cannot in general be 
characterized by the global or even local maximum of the payoff or fitness of the whole population. 
We will explore similarities and differences between these systems. 

The behaviour of systems of interacting individuals can often be described within game-theoretic 
models [48, 24, 25, 103, 100, 79, 36, 106, 27, 14, 37, 66, 67, 68]. In such models, players have at their 
disposal certain strategies and their payoffs in a game depend on strategies chosen both by them and 
by their opponents. The central concept in game theory is that of a Nash equilibrium. It is an 
assignment of strategies to players such that no player, for fixed strategies of his opponents, has an 
incentive to deviate from his current strategy; no change can increase his payoff. 

In Chapter 3, we present a crash course in game theory. One of the fundamental problems in 
game theory is the equilibrium selection in games with multiple Nash equilibria. Some two-player 
symmetric games with two strategies have two pure Nash equilibria, and it may happen that one of them is 
payoff-dominant (also called efficient) and the other one is risk-dominant. In the efficient equilibrium, 
players receive the highest possible payoffs. A strategy is risk-dominant if it has a higher expected payoff 
against a player playing both strategies with equal probabilities. It is played by individuals averse to 
risk. One of the selection methods is to construct a dynamical system where in the long run only one 
equilibrium is played with a high frequency. 

John Maynard Smith [46, 47, 48] has refined the concept of the Nash equilibrium to include the 
stability of equilibria against mutants. He introduced the fundamental notion of an evolutionarily 
stable strategy. If everybody plays such a strategy, then a small number of mutants playing a 
different strategy is eliminated from the population. The dynamical interpretation of the evolutionarily 
stable strategy was later provided by several authors [94, 35, 109]. They proposed a system of difference 
or differential replicator equations which describe the time-evolution of frequencies of strategies. Nash 
equilibria are stationary points of this dynamics. It appears that in games with a payoff dominant 
equilibrium and a risk-dominant one, both are asymptotically stable but the second one has a larger 
basin of attraction in the replicator dynamics. 

In Chapter 4, we introduce replicator dynamics and review theorems concerning asymptotic 
stability of Nash equilibria [103, 36, 37]. Then in Chapter 5, we present our own model of the replicator 
dynamics [55] with a migration between two subpopulations for which an efficient equilibrium is almost 
globally asymptotically stable. 

It is very natural, and in fact important, to introduce a time delay in the population dynamics; 
a time delay between acquiring information and acting upon this knowledge or a time delay between 
playing games and receiving payoffs. Recently Tao and Wang [92] investigated the effect of a time 
delay on the stability of interior stationary points of the replicator dynamics. They considered 
two-player games with two strategies and a unique asymptotically stable interior stationary point. They 
proposed a certain form of a time-delay differential replicator equation. They showed that the mixed 
equilibrium is asymptotically stable if the time delay is small. For sufficiently large delays it becomes 
unstable. 

In Chapter 6, we construct two models of discrete-time replicator dynamics with a time delay 
[2]. In the social-type model, players imitate opponents taking into account average payoffs of games 
played some units of time ago. In the biological-type model, new players are born from parents who 
played in the past. We consider two-player games with two strategies and a unique mixed Nash 
equilibrium. We show that in the first type of dynamics, it is asymptotically stable for small time 
delays and becomes unstable for large ones when the population oscillates around its stationary state. 
In the second type of dynamics, however, the Nash equilibrium is asymptotically stable for any time 
delay. Our proofs are elementary; they do not rely on the general theory of delay differential and 
difference equations. 

Replicator dynamics models population behaviour in the limit of the infinite number of individuals. 
However, real populations are finite. Stochastic effects connected with random matchings of players, 
mistakes of players and biological mutations can play a significant role in such systems. We will discuss 
various stochastic adaptation dynamics of populations with a fixed number of players interacting in 
discrete moments of time. In well-mixed populations, individuals are randomly matched to play a 
game [40, 76, 54]. The deterministic selection part of the dynamics ensures that if the mean payoff 
of a given strategy is bigger than the mean payoff of the other one, then the number of individuals 
playing the given strategy increases. However, players may mutate hence the population may move 
against a selection pressure. In spatial games, individuals are located on vertices of certain graphs 
and they interact only with their neighbours; see for example [62, 63, 64, 5, 17, 106, 18, 44, 41, 7, 86, 
87, 89, 30, 31, 32, 33] and a recent review [90] and references therein. In discrete moments of time, 
players adapt to their opponents by choosing with a high probability the strategy which is the best 
response, i.e. the one which maximizes the sum of the payoffs obtained from individual games. With 
a small probability, representing the noise of the system, they make mistakes. The above described 
stochastic dynamics constitute Markov chains with states describing the number of individuals 
playing respective strategies or corresponding to complete profiles of strategies in the case of spatial 
games. Because of the presence of random mutations, our Markov chains are ergodic (irreducible and 
aperiodic) and therefore they possess unique stationary measures. To describe the long-run behavior 
of such stochastic dynamics, Foster and Young [22] introduced a concept of stochastic stability. A 
configuration of the system is stochastically stable if it has a positive probability in the stationary 
measure of the corresponding Markov chain in the zero-noise limit, that is the zero probability of 
mistakes. It means that in the long run we observe it with a positive frequency along almost any time 
trajectory. 

In Chapter 7, we introduce the concept of stochastic stability and present a useful representation 
of stationary measures of ergodic Markov chains [107, 23, 83]. 

In Chapter 8, we discuss well-mixed populations with random matching of players. 
We review recent results concerning the dependence of the long-run behavior of such systems 
on the number of players and the noise level. In the case of two-player games with two symmetric 
Nash equilibria, an efficient one and a risk-dominant one, when the number of players increases, the 
population twice undergoes a transition between its equilibria. In addition, for a sufficiently large 
number of individuals, the population undergoes another equilibrium transition when the noise 
decreases. 

In Chapter 9, we discuss spatial games. We will see that in such models, the notion of a Nash 
equilibrium (called there a Nash configuration) is similar to the notion of a ground-state configuration 
in classical lattice-gas models of interacting particles. We discuss similarities and differences between 
systems of interacting players in spatial games maximizing their individual payoffs and particles in 
lattice-gas models minimizing their interaction energy. 

The concept of stochastic stability is based on the zero-noise limit for a fixed number of players. 
However, for any arbitrarily low but fixed noise, if the number of players is large enough, the probability 
of any individual configuration is practically zero. It means that for a large number of players, 
to observe a stochastically stable configuration we must assume that players make mistakes with 
extremely small probabilities. On the other hand, it may happen that in the long run, for a low but 
fixed noise and sufficiently large number of players, the stationary configuration is highly concentrated 
on an ensemble consisting of one Nash configuration and its small perturbations, i.e. configurations 
where most players play the same strategy. We will call such configurations ensemble stable. It 
will be shown that these two stability concepts do not necessarily coincide. We will present examples 
of spatial games with three strategies where concepts of stochastic stability and ensemble stability do 
not coincide [51, 53]. In particular, we may have the situation, where a stochastically stable strategy 
is played in the long run with an arbitrarily low frequency. In fact, when the noise level decreases, the 
population undergoes a sharp transition with the coexistence of two equilibria for some noise level. 
Finally, we discuss the influence of dominated strategies on the long-run behaviour of population 
dynamics. 

In Chapter 10, we briefly review other results concerning stochastic dynamics of finite populations. 

3 A crash course in game theory 

To characterize a game-theoretic model one has to specify players, strategies they have at their disposal 
and payoffs they receive. Let us denote by $I = \{1, \ldots, n\}$ the set of players. Every player has at his 
disposal $m$ different strategies. Let $S = \{1, \ldots, m\}$ be the set of strategies, then $\Omega = S^I$ is the set of 
strategy profiles, that is functions assigning strategies to players. The payoff of any player depends not 
only on his strategy but also on strategies of all other players. If $X \in \Omega$, then we write $X = (X_i, X_{-i})$, 
where $X_i \in S$ is a strategy of the i-th player and $X_{-i} \in S^{I \setminus \{i\}}$ is a strategy profile of remaining 
players. The payoff of the i-th player is a function defined on the set of profiles, 

$$U_i : \Omega \to \mathbb{R}, \quad i = 1, \ldots, n.$$

The central concept in game theory is that of a Nash equilibrium. An assignment of strategies 
to players is a Nash equilibrium, if for each player, for fixed strategies of his opponents, changing 
his current strategy cannot increase his payoff. The formal definition will be given later on when we 
enlarge the set of strategies by mixed ones. 

Although in many models the number of players is very large (or even infinite as we will see later 
on in replicator dynamics models), their strategic interactions are usually decomposed into a sum of 
two-player games. Only recently, there have appeared some systematic studies of truly multi-player 
games [42, 10, 11, 75]. Here we will discuss only two-player games with two or three strategies. We 
begin with games with two strategies, A and B. Payoff functions can then be represented by 2 x 2 
payoff matrices. A general payoff matrix is given by 

$$U = \begin{array}{c|cc} & A & B \\ \hline A & a & b \\ B & c & d \end{array}$$

where $U_{kl}$, $k, l = A, B$, is a payoff of the first (row) player when he plays the strategy $k$ and the 
second (column) player plays the strategy $l$. 

We assume that both players are the same and hence payoffs of the column player are given by the 
matrix transposed to U ; such games are called symmetric. In this classic set-up of static games (called 
matrix games or games in the normal form), players know payoff matrices, simultaneously announce 
(use) their strategies and receive payoffs according to their payoff matrices. 

We will now present three main examples of symmetric two-player games with two strategies. We 
begin with an anecdote, then an appropriate game-theoretic model is built and its Nash equilibria are 
found. 

Example 1 (Stag-hunt game) 

Jean-Jacques Rousseau wrote, in his Discourse on the Origin and Basis of Inequality among Men, about 
two hunters going either after a stag or a hare [72, 24]. In order to get a stag, both hunters must be 
loyal to one another and stay at their positions. A single hunter, deserting his companion, can get 
his own hare. In the game-theory language, we have two players and each of them has at his disposal 
two strategies: Stag (St) and Hare (H). In order to present this example as a matrix game we have to 
assign some values to animals. Let a stag (which is shared by two hunters) be worth 10 units and a 
hare 3 units. Then the payoff matrix of this symmetric game is as follows: 

$$U = \begin{array}{c|cc} & St & H \\ \hline St & 5 & 0 \\ H & 3 & 3 \end{array}$$

It is easy to see that there are two Nash equilibria: (St, St) and (H, H). 

In a general payoff matrix, if $a > c$ and $d > b$, then both (A, A) and (B, B) are Nash equilibria. If 
$a + b < c + d$, then the strategy B has a higher expected payoff against a player playing both strategies 
with the probability 1/2. We say that B risk dominates the strategy A (the notion of 
risk-dominance was introduced and thoroughly studied by Harsanyi and Selten [29]). If at the same time 
$a > d$, then we have a selection problem of choosing between the payoff-dominant (Pareto-efficient) 
equilibrium (A, A) and the risk-dominant (B, B). 
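
The three inequalities above can be checked mechanically. The following is a minimal sketch, not part of the original notes; the function name `classify_symmetric_2x2` and the dictionary keys are illustrative assumptions.

```python
def classify_symmetric_2x2(a, b, c, d):
    """Check the conditions from the text for the payoff matrix [[a, b], [c, d]]."""
    return {
        # (A, A) and (B, B) are both Nash equilibria when a > c and d > b.
        "two_pure_symmetric_equilibria": a > c and d > b,
        # B earns more against a 1/2-1/2 opponent when (c + d)/2 > (a + b)/2.
        "B_risk_dominates_A": a + b < c + d,
        # (A, A) pays both players more than (B, B) when a > d.
        "AA_payoff_dominant": a > d,
    }


# Stag-hunt payoffs from Example 1, with A = Stag and B = Hare.
stag_hunt = classify_symmetric_2x2(a=5, b=0, c=3, d=3)
print(stag_hunt)
```

For the Stag-hunt payoffs all three conditions hold, so this game exhibits the selection problem between (St, St) and (H, H).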

Example 2 (Hawk-Dove game) 

Two animals are fighting for a certain territory of a value V. They can be either aggressive (hawk 
strategy - H) or peaceful (dove strategy - D). When two hawks meet, they incur the cost of fighting 
$C > V$ and then they split the territory. When two doves meet, they split the territory without a 
fight. A dove gives up the territory to a hawk. We obtain the following payoff matrix: 

$$U = \begin{array}{c|cc} & H & D \\ \hline H & (V-C)/2 & V \\ D & 0 & V/2 \end{array}$$



The Hawk-Dove game was analyzed by John Maynard Smith [48]. It is also known as the Chicken 
game [77] or the Snowdrift game [31]. It has two non-symmetric Nash equilibria: (H, D) and (D, H). 

Example 3 (Prisoner's Dilemma) 

The following story was discussed by Melvin Dresher, Merrill Flood, and Albert Tucker [4, 73, 84]. 
Two suspects of a bank robbery are caught and interrogated by the police. The police offer them 
separately the following deal. If a suspect testifies against his colleague (a strategy of defection - D), 
and the other does not (cooperation - C), his sentence will be reduced by five years. If both suspects 
testify, that is defect, they will get the reduction of only one year. However, if they both cooperate 
and do not testify, their sentence, because of the lack of hard evidence, will be reduced by three 
years. We obtain the following payoff matrix: 

$$U = \begin{array}{c|cc} & C & D \\ \hline C & 3 & 0 \\ D & 5 & 1 \end{array}$$

The strategy C is a dominated strategy - it results in a lower payoff than the strategy D, 
regardless of a strategy used by the other player. Therefore, (D, D) is the unique Nash equilibrium 
but both players are much better off when they play C - this is the classic Prisoner's Dilemma. 

A novel behaviour can appear in games with three strategies. 

Example 4 (Rock-Scissors-Paper game) 

In this game, each of two players simultaneously exhibits a sign of either a scissors (S), a rock (R), or 
a paper (P). The game has a cyclic behaviour: rock crushes scissors, scissors cut paper, and finally 
paper wraps rock. The payoffs can be given by the following matrix: 

$$U = \begin{array}{c|ccc} & R & S & P \\ \hline R & 1 & 2 & 0 \\ S & 0 & 1 & 2 \\ P & 2 & 0 & 1 \end{array}$$


It is easy to verify that this game, because of its cyclic behavior, does not have any Nash equilibria 
as defined so far. However, we intuitively feel that when we repeat it many times, the only way not 
to be exploited is to mix randomly strategies, i.e. to choose each strategy with the probability 1/3. 
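
The pure-strategy equilibria claimed for the four examples can be verified by brute force. The sketch below is an illustration only: the helper `pure_nash_equilibria` is a hypothetical name, and the Hawk-Dove values V = 2, C = 4 are assumed numbers chosen to satisfy C > V.

```python
from itertools import product


def pure_nash_equilibria(U, names):
    """List pure Nash equilibria (row, column) of the symmetric game U.

    U[k][l] is the row player's payoff; by symmetry the column player
    receives U[l][k] in the same profile.
    """
    m = len(U)
    equilibria = []
    for k, l in product(range(m), repeat=2):
        # Row player cannot gain by switching rows while the column stays fixed.
        row_ok = all(U[k][l] >= U[z][l] for z in range(m))
        # Column player cannot gain by switching columns while the row stays fixed.
        col_ok = all(U[l][k] >= U[z][k] for z in range(m))
        if row_ok and col_ok:
            equilibria.append((names[k], names[l]))
    return equilibria


V, C = 2, 4  # assumed Hawk-Dove parameters with C > V
games = {
    "stag_hunt": ([[5, 0], [3, 3]], ["St", "H"]),
    "hawk_dove": ([[(V - C) / 2, V], [0, V / 2]], ["H", "D"]),
    "prisoners_dilemma": ([[3, 0], [5, 1]], ["C", "D"]),
    "rock_scissors_paper": ([[1, 2, 0], [0, 1, 2], [2, 0, 1]], ["R", "S", "P"]),
}
results = {name: pure_nash_equilibria(U, labels) for name, (U, labels) in games.items()}
for name, found in results.items():
    print(name, found)
```

The empty list returned for Rock-Scissors-Paper is the computational counterpart of the statement that this game has no Nash equilibrium in pure strategies.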

This brings us to a concept of a mixed strategy, a probability mass function on the set of pure 
strategies $S$. Formally, a mixed strategy $x$ is an element of a simplex 

$$\Delta = \Big\{x \in \mathbb{R}^m,\ 0 \leq x_k \leq 1,\ \sum_{k=1}^{m} x_k = 1\Big\}.$$

By the support of a mixed strategy $x$ we mean the set of pure strategies with positive probabilities 
in $x$. Payoffs of mixed strategies are defined as appropriate expected values. In two-player games, a 
player who uses a mixed strategy $x$ against a player with a mixed strategy $y$ receives a payoff given 
by 

$$\sum_{k,l \in S} U_{kl} x_k y_l.$$

In general n-player games, profiles of strategies are now elements of $\Theta = \Delta^I$. We are now ready to 
define formally a Nash equilibrium. 

Definition 1 $X \in \Theta$ is a Nash equilibrium if for every $i \in I$ and every $y \in \Delta$, 

$$U_i(X_i, X_{-i}) \geq U_i(y, X_{-i}).$$

In the mixed Nash equilibrium, expected payoffs of all strategies in its support should be equal. 
Otherwise a player could increase his payoff by increasing the probability of playing a strategy with 
the higher expected payoff. In two-player games with two strategies, we identify a mixed strategy 
with its first component, $x = x_1$. Then the expected payoff of A is given by $ax + b(1 - x)$ and that of 
B by $cx + d(1 - x)$. The value $x^* = (d - b)/(d - b + a - c)$ for which the above two expected values are equal is 
a mixed Nash equilibrium or, more formally, the profile $(x^*, x^*)$ is a Nash equilibrium. 

In Examples 1 and 2, in addition to Nash equilibria in pure strategies, we have mixed equilibria, 
$x^* = 3/5$ and $x^* = V/C$ respectively. It is obvious that the Prisoner's Dilemma game does not have 
any mixed Nash equilibria. On the other hand, the only Nash equilibrium of the Rock-Scissors-Paper 
game is a mixed one assigning the probability 1/3 to each strategy. 
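
The values of the mixed equilibria can be reproduced with exact rational arithmetic. This is a small sketch under stated assumptions: the helper names are hypothetical, and the Hawk-Dove numbers V = 2, C = 4 are chosen only for illustration (the payoffs are doubled to keep them integer, which does not change x*).

```python
from fractions import Fraction


def mixed_equilibrium_2x2(a, b, c, d):
    """Return x* = (d - b) / (d - b + a - c) for integer payoffs a, b, c, d."""
    return Fraction(d - b, d - b + a - c)


def expected_payoffs(U, y):
    """Expected payoff of every pure strategy against the mixed strategy y."""
    return [sum(U[k][l] * y[l] for l in range(len(y))) for k in range(len(U))]


# Stag-hunt game: a = 5, b = 0, c = 3, d = 3.
x_stag = mixed_equilibrium_2x2(5, 0, 3, 3)
stag_payoffs = expected_payoffs([[5, 0], [3, 3]], [x_stag, 1 - x_stag])

# Hawk-Dove game with assumed V = 2, C = 4; all payoffs multiplied by 2.
V, C = 2, 4
x_hawk = mixed_equilibrium_2x2(V - C, 2 * V, 0, V)

# Rock-Scissors-Paper game against the uniform mixture (1/3, 1/3, 1/3).
third = Fraction(1, 3)
rsp_payoffs = expected_payoffs([[1, 2, 0], [0, 1, 2], [2, 0, 1]], [third] * 3)

print(x_stag, stag_payoffs, x_hawk, rsp_payoffs)
```

In each case all strategies in the support earn the same expected payoff, which is the indifference property of a mixed Nash equilibrium stated above.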

We end this chapter with a fundamental theorem due to John Nash [59, 60]. 

Theorem 1 Every game with a finite number of players and a finite number of strategies has at least 
one Nash equilibrium. 

In any Nash equilibrium, every player uses a strategy which is a best reply to the profile of strategies 
of remaining players. Therefore a Nash equilibrium can be seen as a best reply to itself - a fixed point 
of a certain best-reply correspondence. Then one can use the Kakutani fixed point theorem to prove 
the above theorem. 

4 Replicator dynamics 

The concept of a Nash equilibrium is a static one. Here we will introduce the classical replicator 
dynamics and review its main properties [103, 36, 37]. Replicator dynamics provides a dynamical way 
of achieving Nash equilibria in populations. We will see that Nash equilibria are stationary points of 
such dynamics and some of them are asymptotically stable. 

Imagine a finite but very large population of individuals. Assume that they are paired randomly 
to play a symmetric two-player game with two strategies and the payoff matrix given in the beginning 
of the previous chapter. The complete information about such population is provided by its strategy 
profile, that is an assignment of pure strategies to players. Here we will be interested only in the 
proportion of individuals playing respective strategies. We assume that individuals receive average 
payoffs with respect to all possible opponents - they play against the average strategy. 

Let $r_i(t)$, $i = A, B$, be the number of individuals playing the strategy A and B respectively at the 
time $t$. Then $r(t) = r_A(t) + r_B(t)$ is the total number of players and $x(t) = r_A(t)/r(t)$ is the fraction of the 
population playing A. 

We assume that during the small time interval $\epsilon$, only an $\epsilon$ fraction of the population takes part 
in pairwise competitions, that is plays games. We write 

$$r_i(t + \epsilon) = (1 - \epsilon) r_i(t) + \epsilon r_i(t) U_i(t), \quad i = A, B, \tag{1}$$

where $U_A(t) = a x(t) + b(1 - x(t))$ and $U_B(t) = c x(t) + d(1 - x(t))$ are average payoffs of individuals 
playing A and B respectively. We assume that all payoffs are not smaller than 0, hence $r_A$ and $r_B$ are 
always non-negative and therefore $0 \leq x \leq 1$. 

The equation for the total number of players reads 



$$r(t + \epsilon) = (1 - \epsilon) r(t) + \epsilon r(t) \bar{U}(t), \tag{2}$$

where $\bar{U}(t) = x(t) U_A(t) + (1 - x(t)) U_B(t)$ is the average payoff in the population at the time $t$. When 
we divide (1) by (2) we obtain an equation for the frequency of the strategy A, 

$$x(t + \epsilon) - x(t) = \epsilon \frac{x(t)[U_A(t) - \bar{U}(t)]}{1 - \epsilon + \epsilon \bar{U}(t)}. \tag{3}$$
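
The division of (1) by (2) can be spelled out as follows. This is a short worked derivation that uses only equations (1) and (2) and the definition of x as the ratio of the number of A-players to the total number of players.

```latex
\begin{align*}
x(t+\epsilon) &= \frac{r_A(t+\epsilon)}{r(t+\epsilon)}
  = \frac{r_A(t)\,[1-\epsilon+\epsilon U_A(t)]}{r(t)\,[1-\epsilon+\epsilon \bar U(t)]}
  = x(t)\,\frac{1-\epsilon+\epsilon U_A(t)}{1-\epsilon+\epsilon \bar U(t)},\\[4pt]
x(t+\epsilon)-x(t) &= x(t)\,\frac{[1-\epsilon+\epsilon U_A(t)]-[1-\epsilon+\epsilon \bar U(t)]}{1-\epsilon+\epsilon \bar U(t)}
  = \epsilon\,\frac{x(t)\,[U_A(t)-\bar U(t)]}{1-\epsilon+\epsilon \bar U(t)} .
\end{align*}
```

Equation (2) itself follows from adding the two equations in (1), since $r_A U_A + r_B U_B = r \bar{U}$.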

Now we divide both sides of (3) by $\epsilon$, take the limit $\epsilon \to 0$, and obtain the well known differential 
replicator equation: 

$$\frac{dx(t)}{dt} = x(t)[U_A(t) - \bar{U}(t)]. \tag{4}$$

The above equation can also be written as 

$$\frac{dx(t)}{dt} = x(t)(1 - x(t))[U_A(t) - U_B(t)] = (a - c + d - b)\, x(t)(1 - x(t))(x(t) - x^*). \tag{5}$$
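
The discrete-time equation (3) is easy to iterate numerically. The sketch below does so for the Stag-hunt payoffs; the function names, the step size 0.01 and the number of steps are illustrative assumptions, not choices made in the notes.

```python
def replicator_step(x, eps, a, b, c, d):
    """One step of the discrete-time replicator equation (3) for the frequency x of A."""
    u_a = a * x + b * (1 - x)          # average payoff of A-players
    u_b = c * x + d * (1 - x)          # average payoff of B-players
    u_bar = x * u_a + (1 - x) * u_b    # average payoff in the population
    return x + eps * x * (u_a - u_bar) / (1 - eps + eps * u_bar)


def iterate(x0, steps=5000, eps=0.01, payoffs=(5, 0, 3, 3)):
    """Iterate equation (3) starting from the frequency x0."""
    x = x0
    for _ in range(steps):
        x = replicator_step(x, eps, *payoffs)
    return x


above = iterate(0.7)  # starts above the mixed equilibrium x* = 3/5
below = iterate(0.5)  # starts below the mixed equilibrium x* = 3/5
print(above, below)
```

Starting above 3/5 the frequency of the Stag strategy approaches 1, and starting below it approaches 0, in agreement with the sign of the right-hand side of the replicator equation.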



For games with $m$ strategies we obtain a system of $m$ differential equations for $x_k(t)$, fractions of 
the population playing the k-th strategy at the time $t$, $k = 1, \ldots, m$, 

$$\frac{dx_k(t)}{dt} = x_k(t) \Big[ \sum_{l=1}^{m} U_{kl} x_l(t) - \sum_{j,l=1}^{m} U_{jl} x_j(t) x_l(t) \Big], \tag{6}$$

where on the right-hand side of (6) there is a difference of the average payoff of the k-th strategy and 
the average payoff of the population. The above system of differential equations or analogous difference 
equations, called replicator dynamics, was proposed in [94, 35, 109]. For any initial condition $x^0 \in \Delta$, 
it has the unique global solution, $\xi(x^0, t)$, which stays in the simplex $\Delta$. 
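
The right-hand side of the replicator system for m strategies (each frequency grows in proportion to the difference between the payoff of its strategy and the population average) can be written in a few lines. This is a minimal sketch; `replicator_rhs` is a hypothetical helper name and the sample state is arbitrary.

```python
def replicator_rhs(U, x):
    """Time derivatives of the strategy frequencies x for the payoff matrix U."""
    m = len(x)
    # Average payoff of each pure strategy against the population state x.
    strategy_payoffs = [sum(U[k][l] * x[l] for l in range(m)) for k in range(m)]
    # Average payoff in the whole population.
    mean_payoff = sum(x[k] * strategy_payoffs[k] for k in range(m))
    return [x[k] * (strategy_payoffs[k] - mean_payoff) for k in range(m)]


rsp = [[1, 2, 0], [0, 1, 2], [2, 0, 1]]
velocity = replicator_rhs(rsp, [0.5, 0.3, 0.2])
at_center = replicator_rhs(rsp, [1 / 3, 1 / 3, 1 / 3])
two_strategy = replicator_rhs([[5, 0], [3, 3]], [0.7, 0.3])
print(velocity, sum(velocity), at_center, two_strategy)
```

The derivatives sum to zero, which is why the solution stays in the simplex, and for two strategies the first component reduces to the right-hand side x(1 - x)[U_A - U_B] of the two-strategy equation.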

Now we review some theorems relating replicator dynamics and Nash equilibria [103, 36, 37]. We 
consider symmetric two-player games. We denote the set of strategies corresponding to symmetric 
Nash equilibria by 

$$\Delta^{NE} = \{x \in \Delta : (x, x) \text{ is a Nash equilibrium}\}.$$

It follows from the definition of the Nash equilibrium (see also discussion in the previous chapter 
concerning mixed strategies) that 

$$\Delta^{NE} = \{x \in \Delta : u(i, x) = \max_{z \in \Delta} u(z, x) \text{ for every } i \text{ in the support of } x\}.$$

It is easy to see that 

$$\Delta^{0} = \{x \in \Delta : u(i, x) = u(x, x) \text{ for every } i \text{ in the support of } x\}$$

is the set of stationary points of the replicator dynamics. 

It follows that symmetric Nash equilibria are stationary points of the replicator dynamics. 

Theorem 2 $S \cup \Delta^{NE} \subset \Delta^{0}$ 

The following two theorems relate stability of stationary points to Nash equilibria [103, 36]. 

Theorem 3 If $x \in \Delta$ is Lyapunov stable, then $x \in \Delta^{NE}$. 

Theorem 4 If $x^0 \in \mathrm{interior}(\Delta)$ and $\xi(x^0, t) \to x$ as $t \to \infty$, then $x \in \Delta^{NE}$. 
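
Below we present the replicator dynamics in the examples of two-player games discussed in the 
previous chapter. We write replicator equations and describe their phase diagrams. 

Stag-hunt game 

$$\frac{dx}{dt} = x(1 - x)(5x - 3)$$

Phase diagram on [0, 1]: x decreases to 0 for x < 3/5 and increases to 1 for x > 3/5. 

Hawk-Dove game 

$$\frac{dx}{dt} = -\frac{C}{2}\, x(1 - x)(x - V/C)$$

Phase diagram on [0, 1]: x increases for x < V/C and decreases for x > V/C. 

Prisoner's Dilemma 

$$\frac{dx}{dt} = -x(1 - x)(x + 1)$$

Phase diagram on [0, 1]: x decreases to 0 from any x < 1. 
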

We see that in the Stag-hunt game, both pure Nash equilibria are asymptotically stable. The 
risk-dominant one has the larger basin of attraction, which is true in general because 
$x^* = (d - b)/(d - b + a - c) > 1/2$ for games with an efficient equilibrium and a risk-dominant one. 
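
The inequality for x* is a direct consequence of the risk-dominance condition. A short worked step, assuming as before that a > c and d > b, so that the denominator d - b + a - c is positive:

```latex
x^* = \frac{d-b}{d-b+a-c} > \frac12
\iff 2(d-b) > (d-b) + (a-c)
\iff d-b > a-c
\iff a+b < c+d .
```

The basin of attraction of the risk-dominant equilibrium (B, B) is the interval [0, x*), whose length x* therefore exceeds 1/2.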

In the Hawk-Dove game, the unique symmetric mixed Nash equilibrium is asymptotically stable. 

In the Prisoner's Dilemma, the strategy of defection is globally asymptotically stable. 

In the Rock-Scissors-Paper game, a more detailed analysis has to be done. One can show, 
by straightforward computations, that the time derivative of $\ln(x_1 x_2 x_3)$ is equal to zero. Therefore 
$\ln(x_1 x_2 x_3) = c$ is an equation of a closed orbit for any constant $c < \ln(1/27)$. The stationary point (1/3, 1/3, 1/3) 
of the replicator dynamics is Lyapunov stable and the population cycles on a closed trajectory (which 
depends on the initial condition) around its Nash equilibrium. 
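
The conservation of the product of the three frequencies can be observed numerically. The sketch below integrates the replicator equations for the Rock-Scissors-Paper matrix with a classical fourth-order Runge-Kutta scheme; the integrator, the step size 0.01 and the initial state are assumptions made for illustration, and the numerical product is conserved only up to discretization error.

```python
def replicator_rhs(U, x):
    """Right-hand side of the replicator equations for the payoff matrix U."""
    m = len(x)
    payoffs = [sum(U[k][l] * x[l] for l in range(m)) for k in range(m)]
    mean = sum(x[k] * payoffs[k] for k in range(m))
    return [x[k] * (payoffs[k] - mean) for k in range(m)]


def rk4_step(U, x, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]

    k1 = replicator_rhs(U, x)
    k2 = replicator_rhs(U, shifted(x, k1, dt / 2))
    k3 = replicator_rhs(U, shifted(x, k2, dt / 2))
    k4 = replicator_rhs(U, shifted(x, k3, dt))
    return [xi + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
            for xi, s1, s2, s3, s4 in zip(x, k1, k2, k3, k4)]


RSP = [[1, 2, 0], [0, 1, 2], [2, 0, 1]]
state = [0.5, 0.3, 0.2]
initial_product = state[0] * state[1] * state[2]
for _ in range(5000):  # integrate up to time 50
    state = rk4_step(RSP, state, dt=0.01)
final_product = state[0] * state[1] * state[2]
print(initial_product, final_product, state)
```

The product stays essentially constant along the trajectory, so the state neither converges to (1/3, 1/3, 1/3) nor escapes to the boundary of the simplex.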

5 Replicator dynamics with migration 

We discuss here a game-theoretic dynamics of a population of replicating individuals who can migrate between 
two subpopulations or habitats [55]. We consider symmetric two-player games with two strategies: A 
and B. We assume that $a > d > c$, $d > b$, and $a + b < c + d$ in a general payoff matrix given in the 
beginning of Chapter 3. Such games have two pure Nash equilibria: the efficient one (A, A) in which the 
population is in a state with a maximal fitness (payoff) and the risk-dominant (B, B) where players 
are averse to risk. We show that for a large range of parameters of our dynamics, even if the initial 
conditions in both habitats are in the basin of attraction of the risk-dominant equilibrium (with respect 
to the standard replication dynamics without migration), in the long run most individuals play the 
efficient strategy. 

We consider a large population of identical individuals who at each time step can belong to one of 
two different non-overlapping subpopulations or habitats which differ only by their replication rates. 
In both habitats, they take part in the same two-player symmetric game. Our population dynamics 
consists of two parts: the standard replicator one and a migration between subpopulations. Individuals 
are allowed to change their habitats. They move to a habitat in which the average payoff of their 
strategy is higher; they do not change their strategies. 

Migration helps the population to evolve towards an efficient equilibrium. Below we briefly describe 
the mechanism responsible for it. If in a subpopulation, the fraction of individuals playing the 
efficient strategy A is above its unique mixed Nash equilibrium fraction, then the expected payoff 
of A is bigger than that of B in this subpopulation, and therefore the subpopulation evolves to the 
efficient equilibrium by the replicator dynamics without any migration. Let us assume therefore that 
such fraction is below the Nash equilibrium in both subpopulations. Without loss of generality we 
assume that initial conditions are such that the fraction of individuals playing A is bigger in the first 
subpopulation than in the second one. Hence the expected payoff of A is bigger in the first subpopulation 
than in the second one, and the expected payoff of B is bigger in the second subpopulation 
than in the first one. This implies that a fraction of A-players in the second population will switch to 
the first one and at the same time a fraction of B-players from the first population will switch to the 
second one - migration causes the increase of the fraction of individuals of the first population playing 
A. However, any B-player will have more offspring than any A-player (we are below a mixed Nash 
equilibrium) and this has the opposite effect on the relative number of A-players in the first population 
than the migration. The asymptotic composition of the whole population depends on the competition 
between these two processes. 

We derive sufficient conditions for migration and replication rates such that the whole population 
will be in the long run in a state in which most individuals occupy only one habitat (the first one for 
the above described initial conditions) and play the efficient strategy. 

Let $\epsilon$ be a time step. We allow two subpopulations to replicate with different speeds. We assume 
that during any time-step $\epsilon$, a fraction $\epsilon$ of the first subpopulation and a fraction $\kappa\epsilon$ of the second 
subpopulation plays the game and receives payoffs which are interpreted as the number of their 
offspring. Moreover, we allow a fraction of individuals to migrate to a habitat in which their strategies 
have higher expected payoffs. 

Let $r_s^i$ denote the number of individuals which use the strategy $s \in \{A, B\}$ in the subpopulation 
$i \in \{1, 2\}$. By $U_s^i$ we denote the expected payoff of the strategy $s$ in the subpopulation $i$: 

$$U_A^1 = ax + b(1 - x), \quad U_B^1 = cx + d(1 - x),$$
$$U_A^2 = ay + b(1 - y), \quad U_B^2 = cy + d(1 - y),$$

where 

$$x = \frac{r_A^1}{r_1}, \quad y = \frac{r_A^2}{r_2}, \quad r_1 = r_A^1 + r_B^1, \quad r_2 = r_A^2 + r_B^2;$$

$x$ and $y$ denote fractions of A-players in the first and second population respectively. We denote by 
$\alpha = r_1/r$ the fraction of the whole population in the first subpopulation, where $r = r_1 + r_2$ is the total 
number of individuals. 

The evolution of the number of individuals in each subpopulation is assumed to be a result of 
the replication and the migration flow. In our model, the direction and intensity of migration of 
individuals with a given strategy will be determined by the difference of the expected payoffs of that 
strategy in both habitats. Individuals will migrate to a habitat with a higher payoff. The evolution 
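equations for the number of individuals playing the strategy $s$, $s \in \{A, B\}$, in the habitat $i$, $i \in \{1, 2\}$, 
have the following form: 

$$r_A^1(t + \epsilon) = R_A^1 + \Phi_A, \tag{7}$$
$$r_B^1(t + \epsilon) = R_B^1 + \Phi_B, \tag{8}$$
$$r_A^2(t + \epsilon) = R_A^2 - \Phi_A, \tag{9}$$
$$r_B^2(t + \epsilon) = R_B^2 - \Phi_B, \tag{10}$$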

where all functions on the right-hand sides are calculated at the time t. 

Functions $R_s^i$ describe an increase of the number of the individuals playing the strategy $s$ in the 
subpopulation $i$ due to the replication: 

$$R_s^1 = (1 - \epsilon) r_s^1 + \epsilon U_s^1 r_s^1, \tag{11}$$
$$R_s^2 = (1 - \kappa\epsilon) r_s^2 + \kappa\epsilon U_s^2 r_s^2. \tag{12}$$

The rate of the replication of individuals playing the strategy $s$ in the first subpopulation is given 
by $\epsilon U_s^1$, and in the second subpopulation by $\kappa\epsilon U_s^2$. The parameter $\kappa$ measures the difference of 
reproduction speeds in both habitats. 

Functions $\Phi_s$, $s \in \{A, B\}$, are defined by 

$$\Phi_s = \epsilon\gamma (U^1_s - U^2_s)[r^2_s \Theta(U^1_s - U^2_s) + r^1_s \Theta(U^2_s - U^1_s)], \tag{13}$$

where $\Theta$ is the Heaviside function, 

$$\Theta(x) = \begin{cases} 1, & x \geq 0; \\ 0, & x < 0, \end{cases} \tag{14}$$

and $\gamma$ is the migration rate. 
Functions $\Phi_s$ describe changes of the numbers of the individuals playing strategy $s$ in the relevant 
habitat due to migration. $\Phi_s$ will be referred to as the migration of individuals (who play the strategy 
$s$) between two habitats. 

Thus, if for example $U^1_A > U^2_A$, then there is a migration of individuals with the strategy A from 
the second habitat to the first one: 

$$\Phi_A = \epsilon\gamma r^2_A (U^1_A - U^2_A), \tag{15}$$

and since then necessarily $U^1_B < U^2_B$ [note that $U^1_A - U^2_A = (a - b)(x - y)$ and $U^1_B - U^2_B = (c - d)(x - y)$], 
there is a migration flow of individuals with strategy B from the first habitat to the second one: 

$$\Phi_B = \epsilon\gamma r^1_B (U^1_B - U^2_B). \tag{16}$$

In this case, the migration flow $\Phi_A$ describes the increase of the number of individuals which play 
the strategy A in the first subpopulation due to migration of the individuals playing A in the second 
subpopulation. This increase is assumed to be proportional to the number of individuals playing A in 
the second subpopulation and the difference of payoffs of this strategy in both subpopulations. The 
constant of proportionality is $\epsilon$ times the migration rate $\gamma$. 

The case $\gamma = 0$ corresponds to two separate populations which do not communicate and evolve 
independently. Our model reduces then to the standard discrete-time replicator dynamics. In this 
case, the total number of players who use a given strategy changes only due to the increase or decrease 
of the strategy fitness, as described by functions defined in (11-12). 

In the absence of the replication, there is a conservation of the number of individuals playing each 
strategy in the whole population. This corresponds to our model assumption that individuals can not 
change their strategies but only habitats in which they live. 

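For $U^1_A > U^2_A$ we obtain from (7-10) equations for $r_1(t)$, $r_2(t)$, and $r(t)$: 

$$r_1(t + \epsilon) = (1 - \epsilon) r_1(t) + \epsilon r_1(t)\left[\frac{r^1_A}{r_1} U^1_A + \frac{r^1_B}{r_1} U^1_B + \gamma\left(\frac{r^2_A}{r_1}(U^1_A - U^2_A) + \frac{r^1_B}{r_1}(U^1_B - U^2_B)\right)\right], \tag{17}$$

$$r_2(t + \epsilon) = (1 - \kappa\epsilon) r_2(t) + \epsilon r_2(t)\left[\kappa\left(\frac{r^2_A}{r_2} U^2_A + \frac{r^2_B}{r_2} U^2_B\right) - \gamma\left(\frac{r^2_A}{r_2}(U^1_A - U^2_A) + \frac{r^1_B}{r_2}(U^1_B - U^2_B)\right)\right], \tag{18}$$

$$r(t + \epsilon) = (1 - \epsilon) r_1(t) + (1 - \kappa\epsilon) r_2(t) + \epsilon r(t)\left[\alpha\left(\frac{r^1_A}{r_1} U^1_A + \frac{r^1_B}{r_1} U^1_B\right) + (1 - \alpha)\kappa\left(\frac{r^2_A}{r_2} U^2_A + \frac{r^2_B}{r_2} U^2_B\right)\right], \tag{19}$$

where all functions in square brackets depend on $t$. 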

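Now, like in the derivation of the standard replicator dynamics, we consider frequencies of individuals 
playing the relevant strategies in both habitats. Thus, we focus on the temporal evolution of 
the frequencies, $x$ and $y$, and the relative size of the first subpopulation, $\alpha$. We divide (7) by (17), (9) 
by (18), and (17) by (19). Performing the limit $\epsilon \to 0$ we obtain the following differential equations: 

$$\frac{dx}{dt} = x\left[(1 - x)(U^1_A - U^1_B) + \gamma\left[\frac{y(1 - x)}{x}\frac{1 - \alpha}{\alpha}(U^1_A - U^2_A) - (1 - x)(U^1_B - U^2_B)\right]\right], \tag{20}$$

$$\frac{dy}{dt} = y\left[\kappa(1 - y)(U^2_A - U^2_B) + \gamma\left[(1 - y)(U^2_A - U^1_A) - \frac{(1 - x)\alpha}{1 - \alpha}(U^2_B - U^1_B)\right]\right], \tag{21}$$

$$\frac{d\alpha}{dt} = \alpha(1 - \alpha)[xU^1_A + (1 - x)U^1_B - (yU^2_A + (1 - y)U^2_B)] + \alpha\gamma\left[\frac{y(1 - \alpha)}{\alpha}(U^1_A - U^2_A) + (1 - x)(U^1_B - U^2_B)\right] + \alpha(1 - \alpha)(\kappa - 1)(1 - yU^2_A - (1 - y)U^2_B). \tag{22}$$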

Similar equations are derived for the case $U^1_A < U^2_A$ (since our model is symmetric with respect to 
the permutation of the subpopulations, it is enough to renumber the relevant indices and redefine 
the parameter $\kappa$). 

Assume first that $U^1_A(0) > U^2_A(0)$, which is equivalent to $x(0) > y(0)$. It follows from (7-10) that 
a fraction of A-players from the subpopulation 2 will migrate to the subpopulation 1 and a fraction 
of B-players will migrate in the opposite direction. This will cause $x$ to increase and $y$ to decrease. 
However, if $x(0) < x^*$ and $y(0) < x^*$, then $U^1_A < U^1_B$ and $U^2_A < U^2_B$, therefore B-players will have more 
offspring than A-players. This has the opposite effect on the relative number of A-players in the first 
subpopulation than migration. If $x(0) < y(0)$, then migration takes place in the reverse directions. 

The outcome of the competition between migration and replication depends, for a given payoff 
matrix, on the relation between $x(0) - y(0)$, $\gamma$, and $\kappa$. We are interested in formulating sufficient 
conditions for the parameters of the model, for which most individuals of the whole population will 
play in the long run the efficient strategy A. We prove the following theorem [55]. 

Theorem 5 If 

$$\gamma[x(0) - y(0)] > \max\left[\frac{d - b}{d - c}, \frac{\kappa(a - c)}{a - b}\right],$$

then $x(t) \to_{t \to \infty} 1$ and $y(t) \to_{t \to \infty} 0$. 

If $\kappa < (a - 1)/(d - 1)$, then $\alpha(t) \to_{t \to \infty} 1$. 

If 

$$\gamma[y(0) - x(0)] > \max\left[\frac{\kappa(d - b)}{d - c}, \frac{a - c}{a - b}\right],$$

then $x(t) \to_{t \to \infty} 0$ and $y(t) \to_{t \to \infty} 1$. 

If $\kappa > (d - 1)/(a - 1)$, then $\alpha(t) \to_{t \to \infty} 0$. 

Proof: 

Assume first that $x(0) > y(0)$. From (20-21) we get the following differential inequalities: 

$$\frac{dx}{dt} > x(1 - x)[(U^1_A - U^1_B) + \gamma(U^2_B - U^1_B)], \tag{23}$$

$$\frac{dy}{dt} < y(1 - y)[\kappa(U^2_A - U^2_B) + \gamma(U^2_A - U^1_A)]. \tag{24}$$

Using explicit expressions for $U^i_s$ we get 

$$\frac{dx}{dt} > x(1 - x)[(a - c + d - b)x + b - d + \gamma(d - c)(x - y)], \tag{25}$$

$$\frac{dy}{dt} < y(1 - y)[\kappa[(a - c + d - b)y + b - d] - \gamma(a - b)(x - y)]. \tag{26}$$

We note that if $\gamma(d - c)(x(0) - y(0)) > d - b$ then $\gamma(d - c)(x(0) - y(0)) + b - d + (a - c + d - b)x(0) > 0$, 
i.e. $dx/dt(0) > 0$. 

Analogously, if $\gamma(a - b)(x(0) - y(0)) > \kappa(a - c)$, then $\gamma(a - b)(x(0) - y(0)) > \kappa[(a - c + d - b) + b - d] > 
\kappa[(a - c + d - b)y(0) + b - d]$, therefore $dy/dt(0) < 0$. Thus, combining both conditions we conclude 
that $x(t) - y(t)$ is an increasing function so $x(t) > y(t)$ for all $t > 0$, hence we may use (20-22) all the 
time. We get that $x(t) \to_{t \to \infty} 1$ and $y(t) \to_{t \to \infty} 0$, and the first part of the thesis follows. Now from 
(22) it follows that if $a - d + (\kappa - 1)(1 - d) > 0$, i.e. $\kappa < (a - 1)/(d - 1)$, then $\alpha(t) \to_{t \to \infty} 1$. 

The second part of Theorem 5, corresponding to initial conditions $y(0) > x(0)$, can be proved 
analogously, starting from eqs. (7-10) written for the case $U^1_A(0) < U^2_A(0)$ and their continuous 
counterparts. We omit details. 

The above conditions for $\kappa$ mean that the population consisting of just A-players replicates faster 
(exponentially in $(a - 1)t$) than the one consisting of just B-players (exponentially in $(d - 1)\kappa t$). The 
same results would follow if the coefficients of the payoff matrix of the game played in one habitat 
differed from those in the second habitat by an additive constant. 

We showed that introduction of the mechanism of attraction by the habitat with a higher expected 
payoff in the standard replicator dynamics helps the whole population to reach the state in which in 
the long run most individuals play the efficient strategy. 

More precisely, we proved that for a given rate of migration, if the fractions of individuals playing 
the efficient strategy in both habitats are not too close to each other, then the habitat with a higher 
fraction of such players overcomes the other one in the long run. The fraction of individuals playing the 
efficient strategy tends to unity in this habitat and consequently in the whole population. Alternatively, 
we may say that the larger the rate of migration, the larger the basin of attraction of the efficient 
equilibrium. In particular, we showed that for a large range of parameters of our dynamics, even if the 
initial conditions in both habitats are in the basin of attraction of the risk-dominant equilibrium (with 
respect to the standard replication dynamics without migration), in the long run most individuals 
play the efficient strategy. 
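
The statement above can be checked numerically. The following is a minimal sketch, not part of the original analysis, that iterates the discrete-time model (7)-(13) directly. The payoff values, the step size `eps`, the horizon `T`, and the function names are illustrative assumptions; the payoffs satisfy a > c, d > b, a > d and a + b < c + d, and the chosen initial fractions lie below x* = 4/7 in both habitats.

```python
def heaviside(v):
    # Theta(v) = 1 for v >= 0 and 0 otherwise; the value at 0 does not matter,
    # because the migration term carries the factor U1_s - U2_s.
    return 1.0 if v >= 0 else 0.0


def step(r, payoff, gamma, kappa, eps):
    """One time step of eqs. (7)-(13); r maps (habitat, strategy) to a count."""
    a, b, c, d = payoff
    x = r[1, "A"] / (r[1, "A"] + r[1, "B"])
    y = r[2, "A"] / (r[2, "A"] + r[2, "B"])
    U = {(1, "A"): a * x + b * (1 - x), (1, "B"): c * x + d * (1 - x),
         (2, "A"): a * y + b * (1 - y), (2, "B"): c * y + d * (1 - y)}
    new = {}
    for s in ("A", "B"):
        diff = U[1, s] - U[2, s]
        # Migration flow (13): strategy s moves to the habitat where it earns more.
        source = r[2, s] * heaviside(diff) + r[1, s] * heaviside(-diff)
        phi = eps * gamma * diff * source
        # Replication terms (11) and (12).
        R1 = (1 - eps) * r[1, s] + eps * U[1, s] * r[1, s]
        R2 = (1 - kappa * eps) * r[2, s] + kappa * eps * U[2, s] * r[2, s]
        new[1, s] = R1 + phi
        new[2, s] = R2 - phi
    return new


def simulate(x0, y0, gamma, payoff=(5.0, 0.0, 2.0, 4.0), kappa=1.0,
             eps=1e-3, T=10.0):
    """Return the final fractions (x, y) of A-players in habitats 1 and 2."""
    r = {(1, "A"): x0, (1, "B"): 1 - x0, (2, "A"): y0, (2, "B"): 1 - y0}
    for _ in range(int(round(T / eps))):
        r = step(r, payoff, gamma, kappa, eps)
    x = r[1, "A"] / (r[1, "A"] + r[1, "B"])
    y = r[2, "A"] / (r[2, "A"] + r[2, "B"])
    return x, y
```

With these assumed values, `simulate(0.5, 0.1, gamma=10.0)` satisfies the first condition of Theorem 5, since $\gamma[x(0) - y(0)] = 4$ exceeds $\max[2, 0.6]$, while `simulate(0.5, 0.1, gamma=0.0)` corresponds to two isolated habitats evolving under the standard replicator dynamics.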

6 Replicator dynamics with time delay 

Here we consider two-player games with two strategies, two pure non-symmetric Nash equilibria, and a 
unique symmetric mixed one, that is $a < c$ and $d < b$ in a general payoff matrix given at the beginning 
of Chapter 3. Let us recall that the Hawk-Dove game is of such type. 

Recently Tao and Wang [92] investigated the effect of a time delay on the stability of the mixed 
equilibrium in the replicator dynamics. They showed that it is asymptotically stable if a time delay 
is small. For sufficiently large delays it becomes unstable. 

We construct two models of discrete-time replicator dynamics with a time delay [2] . In the social- 
type model, players imitate opponents taking into account average payoffs of games played some units 
of time ago. In the biological-type model, new players are born from parents who played in the 
past. We show that in the first type of dynamics, the unique symmetric mixed Nash equilibrium is 
asymptotically stable for small time delays and becomes unstable for large ones when the population 
oscillates around its stationary state. In the second type of dynamics, however, the Nash equilibrium 
is asymptotically stable for any time delay. Our proofs are elementary, they do not rely on the general 
theory of delay differential and difference equations. 

6.1 Social-type time delay 

Here we assume that individuals at time $t$ replicate due to average payoffs obtained by their strategies 
at time $t - \tau$ for some delay $\tau > 0$ (see also a discussion after (32)). As in the standard replicator 
dynamics, we assume that during the small time interval $\epsilon$, only an $\epsilon$ fraction of the population takes 
part in pairwise competitions, that is plays games. Let $r_i(t)$, $i = A, B$, be the number of individuals 
playing at the time $t$ the strategy A and B respectively, $r(t) = r_A(t) + r_B(t)$ the total number of 
players and $x(t) = r_A(t)/r(t)$ the fraction of the population playing A. 
We propose the following equations: 

$$r_i(t + \epsilon) = (1 - \epsilon) r_i(t) + \epsilon r_i(t) U_i(t - \tau); \quad i = A, B. \tag{27}$$

Then for the total number of players we get 

$$r(t + \epsilon) = (1 - \epsilon) r(t) + \epsilon r(t) \bar{U}_o(t - \tau), \tag{28}$$

where $\bar{U}_o(t - \tau) = x(t)U_A(t - \tau) + (1 - x(t))U_B(t - \tau)$. 
We divide (27) by (28) and obtain an equation for the frequency of the strategy A, 

$$x(t + \epsilon) - x(t) = \epsilon \frac{x(t)[U_A(t - \tau) - \bar{U}_o(t - \tau)]}{1 - \epsilon + \epsilon \bar{U}_o(t - \tau)} \tag{29}$$

and after some rearrangements we get 

$$x(t + \epsilon) - x(t) = -\epsilon x(t)(1 - x(t))[x(t - \tau) - x^*] \frac{\delta}{1 - \epsilon + \epsilon \bar{U}_o(t - \tau)}, \tag{30}$$

where $x^* = (d - b)/(d - b + a - c)$ is the unique mixed Nash equilibrium of the game and $\delta = c - a + b - d$. 
Now the corresponding replicator dynamics in the continuous time reads 

$$\frac{dx(t)}{dt} = x(t)[U_A(t - \tau) - \bar{U}_o(t - \tau)] \tag{31}$$

and can also be written as 

$$\frac{dx(t)}{dt} = x(t)(1 - x(t))[U_A(t - \tau) - U_B(t - \tau)] = -\delta x(t)(1 - x(t))(x(t - \tau) - x^*). \tag{32}$$

The first equation in (32) can also be interpreted as follows. Assume that randomly chosen players 
imitate randomly chosen opponents. Then the probability that a player who played A would imitate 
the opponent who played B at time $t$ is exactly $x(t)(1 - x(t))$. The intensity of imitation depends on 
the delayed information about the difference of corresponding payoffs at time $t - \tau$. We will therefore 
say that such models have a social-type time delay. 

Equations (31-32) are exactly the time-delay replicator dynamics proposed and analyzed by Tao 
and Wang [92]. They showed that if $\tau < (c - a + b - d)\pi / [2(c - a)(b - d)]$, then the mixed Nash equilibrium, 
$x^*$, is asymptotically stable. When $\tau$ increases beyond the bifurcation value $(c - a + b - d)\pi / [2(c - a)(b - d)]$, 
$x^*$ becomes unstable. We have the following theorem [2]. 

Theorem 6 $x^*$ is asymptotically stable in the dynamics (30) if $\tau$ is sufficiently small and unstable 
for large enough $\tau$. 

Proof: We will assume that $\tau$ is a multiple of $\epsilon$, $\tau = m\epsilon$ for some natural number $m$. Observe first 
that if $x(t - \tau) < x^*$, then $x(t + \epsilon) > x(t)$, and if $x(t - \tau) > x^*$, then $x(t + \epsilon) < x(t)$. Let us assume 
first that there is $t'$ such that $x(t'), x(t' - \epsilon), x(t' - 2\epsilon), \ldots, x(t' - \tau) < x^*$. Then $x(t)$, $t > t'$, increases 
up to the moment $t_1$ for which $x(t_1 - \tau) > x^*$. If such $t_1$ does not exist then $x(t) \to_{t \to \infty} x^*$ and the 
theorem is proved. Now we have $x^* < x(t_1 - \tau) < x(t_1 - \tau + \epsilon) < \ldots < x(t_1)$ and $x(t_1 + \epsilon) < x(t_1)$ 
so $t_1$ is a turning point. Now $x(t)$ decreases up to the moment $t_2$ for which $x(t_2 - \tau) < x^*$. Again, 
if such $t_2$ does not exist, then the theorem follows. Therefore let us assume that there is an infinite 
sequence, $t_i$, of such turning points. Let $\eta_i = |x(t_i) - x^*|$. We will show that $\eta_i \to_{i \to \infty} 0$. 
For $t \in \{t_i, t_i + \epsilon, \ldots, t_{i+1} - \epsilon\}$ we have the following bound for $x(t + \epsilon) - x(t)$: 

$$|x(t + \epsilon) - x(t)| < \frac{1}{4}\eta_i \epsilon \frac{\delta}{1 - \epsilon + \epsilon \bar{U}_o(t - \tau)}. \tag{33}$$

This means that 

$$\eta_{i+1} < (m + 1)\epsilon K \eta_i, \tag{34}$$

where $K$ is the maximal possible value of $\frac{\delta}{4(1 - \epsilon + \epsilon \bar{U}_o(t - \tau))}$. 

We get that if 

$$\tau < \frac{1}{K} - \epsilon, \tag{35}$$

then $\eta_i \to_{i \to \infty} 0$ so $x(t)$ converges to $x^*$. 

Now if for every $t$, $|x(t + \epsilon) - x^*| < \max_{k \in \{0, 1, \ldots, m\}} |x(t - k\epsilon) - x^*|$, then $x(t)$ converges to $x^*$. 
Therefore assume that there is $t''$ such that $|x(t'' + \epsilon) - x^*| \geq \max_{k \in \{0, 1, \ldots, m\}} |x(t'' - k\epsilon) - x^*|$. If $\tau$ 
satisfies (35), then it follows that $x(t'' + \epsilon), \ldots, x(t'' + \epsilon + \tau)$ are all on the same side of $x^*$ and the first part 
of the proof can be applied. We showed that $x(t)$ converges to $x^*$ for any initial conditions different 
from 0 and 1, hence $x^*$ is globally asymptotically stable. 

Now we will show that $x^*$ is unstable for any large enough $\tau$. 

Let $\gamma > 0$ be arbitrarily small and consider the following perturbation of the stationary point $x^*$: 
$x(t) = x^*$, $t \leq 0$, and $x(\epsilon) = x^* + \gamma$. It follows from (30) that $x(k\epsilon) = x(\epsilon)$ for $k = 1, \ldots, m + 1$. Let 
$K' = \min_{x \in [x^* - \gamma, x^* + \gamma]} \frac{x(1 - x)\delta}{1 - \epsilon + \epsilon \bar{U}_o(t - \tau)}$. If $\frac{m}{2}\epsilon K' \gamma > 2\gamma$, that is $\tau > \frac{4}{K'}$, then it follows from (30) that 
after $m/2$ steps (we assume without loss of generality that $m$ is even) $x((m + 1 + m/2)\epsilon) < x^* - \gamma$. 
In fact we have $x((2m + 1)\epsilon) < \ldots < x((m + 1)\epsilon)$ and at least $m/2$ of the $x$'s in this sequence are 
smaller than $x^* - \gamma$. Let $t > (2m + 1)\epsilon$ be the smallest $t$ such that $x(t) > x^* - \gamma$. Then we have 
$x(t - m\epsilon), \ldots, x(t - \epsilon) < x^* - \gamma < x(t)$, hence after $m/2$ steps, $x(t)$ crosses $x^* + \gamma$ and the situation 
repeats itself. 

We showed that if 

$$\tau > \frac{4}{K'}, \tag{36}$$

then there exists an infinite sequence, $\tilde{t}_i$, such that $|x(\tilde{t}_i) - x^*| > \gamma$ and therefore $x^*$ is unstable. 
Moreover, $x(t)$ oscillates around $x^*$. 
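
Both regimes of Theorem 6 can be illustrated by iterating equation (29) directly. The sketch below is only an illustration under assumed values: the payoff matrix with $a = d = 0$, $b = c = 1$ (so $x^* = 1/2$ and $\delta = 2$), the step `eps`, the horizon `T`, the constant initial history `x_init`, and the function name are all choices made here, not taken from [2]. For this game the Tao-Wang bifurcation value equals $\pi$.

```python
def simulate_social(tau, eps=0.01, T=300.0, x_init=0.55,
                    payoff=(0.0, 1.0, 1.0, 0.0)):
    """Iterate eq. (29) and return the whole trajectory of x."""
    a, b, c, d = payoff
    m = int(round(tau / eps))        # delay measured in steps, tau = m * eps
    xs = [x_init] * (m + 1)          # constant history on [-tau, 0]
    for _ in range(int(round(T / eps))):
        x, x_del = xs[-1], xs[-1 - m]
        # Payoffs are evaluated at the delayed frequency x(t - tau).
        UA = a * x_del + b * (1 - x_del)
        UB = c * x_del + d * (1 - x_del)
        # Average payoff weighted by the current frequency x(t), as in (28).
        Uo = x * UA + (1 - x) * UB
        xs.append(x + eps * x * (UA - Uo) / (1 - eps + eps * Uo))
    return xs
```

Calling `simulate_social(1.0)` uses a delay below the bound (35) for this game, and the trajectory settles at $x^*$; `simulate_social(6.0)` uses a delay above $\pi$, and the deviation from $x^*$ does not die out.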

6.2 Biological-type time delay 

Here we assume that individuals born at time $t - \tau$ are able to take part in contests when they become 
mature at time $t$ or equivalently they are born $\tau$ units of time after their parents played and received 
payoffs. We propose the following equations: 

$$r_i(t + \epsilon) = (1 - \epsilon) r_i(t) + \epsilon r_i(t - \tau) U_i(t - \tau); \quad i = A, B. \tag{37}$$

Then the equation for the total number of players reads 

$$r(t + \epsilon) = (1 - \epsilon) r(t) + \epsilon r(t)\left[\frac{r_A(t - \tau)}{r(t)} U_A(t - \tau) + \frac{r_B(t - \tau)}{r(t)} U_B(t - \tau)\right]. \tag{38}$$

We divide (37) by (38) and obtain an equation for the frequency of the first strategy, 

$$x(t + \epsilon) - x(t) = \epsilon \frac{x(t - \tau)U_A(t - \tau) - x(t)\bar{U}(t - \tau)}{(1 - \epsilon)\frac{r(t)}{r(t - \tau)} + \epsilon \bar{U}(t - \tau)}, \tag{39}$$

where $\bar{U}(t - \tau) = x(t - \tau)U_A(t - \tau) + (1 - x(t - \tau))U_B(t - \tau)$. 



We proved in [2] the following 

Theorem 7 $x^*$ is asymptotically stable in the dynamics (39) for any value of the time delay $\tau$. 

We begin by showing our result in the following simple example. 

The payoff matrix is given by $U = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, hence $x^* = \frac{1}{2}$ is the mixed Nash equilibrium which is 
asymptotically stable in the replicator dynamics without the time delay. The equation (39) now reads 

$$x(t + \epsilon) - x(t) = \epsilon \frac{x(t - \tau)(1 - x(t - \tau)) - 2x(t)x(t - \tau)(1 - x(t - \tau))}{(1 - \epsilon)\frac{r(t)}{r(t - \tau)} + 2\epsilon x(t - \tau)(1 - x(t - \tau))}. \tag{40}$$

After simple algebra we get 

$$x(t + \epsilon) - \frac{1}{2} + \frac{1}{2} - x(t) = \epsilon(1 - 2x(t)) \frac{x(t - \tau)(1 - x(t - \tau))}{(1 - \epsilon)\frac{r(t)}{r(t - \tau)} + 2\epsilon x(t - \tau)(1 - x(t - \tau))}, \tag{41}$$

so 

$$x(t + \epsilon) - \frac{1}{2} = \left(x(t) - \frac{1}{2}\right) \frac{1}{1 + \frac{\epsilon r(t - \tau)}{(1 - \epsilon) r(t)} 2x(t - \tau)(1 - x(t - \tau))},$$

hence 

$$\left|x(t + \epsilon) - \frac{1}{2}\right| < \left|x(t) - \frac{1}{2}\right|. \tag{42}$$

It follows that $x^*$ is globally asymptotically stable. 

Now we present the proof for the general payoff matrix with a unique symmetric mixed Nash 
equilibrium. 

Proof of Theorem 7: 

Let $c_t = \frac{x(t)U_A(t)}{\bar{U}(t)}$. Observe that if $x(t) < x^*$, then $c_t > x(t)$, if $x(t) > x^*$, then $c_t < x(t)$, and if 
$x(t) = x^*$, then $c_t = x^*$. We can write (39) as 

$$x(t + \epsilon) - x(t) = \epsilon \frac{x(t - \tau)U_A(t - \tau) - c_{t-\tau}\bar{U}(t - \tau) + c_{t-\tau}\bar{U}(t - \tau) - x(t)\bar{U}(t - \tau)}{(1 - \epsilon)\frac{r(t)}{r(t - \tau)} + \epsilon \bar{U}(t - \tau)} \tag{43}$$

and after some rearrangements we obtain 

$$x(t + \epsilon) - c_{t-\tau} = (x(t) - c_{t-\tau}) \frac{1}{1 + \frac{\epsilon r(t - \tau)}{(1 - \epsilon) r(t)} \bar{U}(t - \tau)}. \tag{44}$$

We get that at time $t + \epsilon$, $x$ is closer to $c_{t-\tau}$ than at time $t$ and it is on the same side of $c_{t-\tau}$. We 
will show that $c$ is an increasing or a constant function of $x$. Let us calculate the derivative of $c$ with 
respect to $x$: 

$$c' = \frac{f(x)}{(xU_A + (1 - x)U_B)^2}, \tag{45}$$

where 

$$f(x) = (ac + bd - 2ad)x^2 + 2d(a - b)x + bd. \tag{46}$$

A simple analysis shows that $f > 0$ on $(0, 1)$ or $f = 0$ on $(0, 1)$ (in the case of $a = d = 0$). Hence 
$c(x)$ is either an increasing or a constant function of $x$. In the latter case, $\forall_x\, c(x) = x^*$, as it happens 
in our example, and the theorem follows. 

We will now show that 

$$|x(t + \tau + \epsilon) - x^*| < \max\{|x(t) - x^*|, |x(t + \tau) - x^*|\}, \tag{47}$$

hence $x(t)$ converges to $x^*$ for any initial conditions different from 0 and 1, so $x^*$ is globally asymptotically 
stable. 

If $x(t) < x^*$ and $x(t + \tau) < x^*$, then $x(t) < c_t < x^*$ and also $x(t + \tau) < c_{t+\tau} < x^*$. 
From (44) we obtain 

$$x(t + \tau) < x(t + \tau + \epsilon) < c_t \quad \text{if } x(t + \tau) < c_t,$$
$$x(t) < x(t + \tau + \epsilon) = c_t \quad \text{if } x(t + \tau) = c_t,$$
$$x(t) < c_t < x(t + \tau + \epsilon) < x(t + \tau) \quad \text{if } x(t + \tau) > c_t,$$

hence (47) holds. 

If $x(t) > x^*$ and $x(t + \tau) < x^*$, then $x(t + \tau) < x^* < c_t < x(t)$ and either $x(t + \tau) < 
x(t + \tau + \epsilon) < x^*$ or $x^* < x(t + \tau + \epsilon) < c_t$, which means that (47) holds. 

The cases of $x(t) > x^*$, $x(t + \tau) > x^*$ and $x(t) < x^*$, $x(t + \tau) > x^*$ can be treated analogously. 
We showed that (47) holds. 
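
The contraction property (42) can be observed numerically by iterating the count equations (37) for the example game. This is a sketch under assumed values: the delay, the step `eps`, the horizon `T`, the constant initial history, and the function name are illustrative choices and not part of the proof.

```python
def simulate_biological(tau, eps=0.01, T=100.0, rA0=0.8, rB0=0.2):
    """Iterate eq. (37) for U = [[0, 1], [1, 0]]; return |x(t) - 1/2| per step."""
    m = int(round(tau / eps))        # delay measured in steps, tau = m * eps
    rA = [rA0] * (m + 1)             # constant history on [-tau, 0]
    rB = [rB0] * (m + 1)
    devs = []
    for _ in range(int(round(T / eps))):
        rA_del, rB_del = rA[-1 - m], rB[-1 - m]
        x_del = rA_del / (rA_del + rB_del)
        UA, UB = 1 - x_del, x_del    # payoffs of A and B at time t - tau
        # Newborns come from parents who played tau time units ago.
        newA = (1 - eps) * rA[-1] + eps * rA_del * UA
        newB = (1 - eps) * rB[-1] + eps * rB_del * UB
        rA.append(newA)
        rB.append(newB)
        devs.append(abs(newA / (newA + newB) - 0.5))
    return devs
```

For example, `simulate_biological(6.0)` uses the same delay for which the social-type dynamics of Section 6.1 keeps oscillating; here the distance from $x^* = 1/2$ shrinks at every step, up to floating-point rounding.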

7 Stochastic dynamics of finite populations 

In the next two chapters we will discuss various stochastic dynamics of populations with a fixed number 
of players interacting in discrete moments of time. We will analyze symmetric two-player games with 
two or three strategies and multiple Nash equilibria. We will address the problem of equilibrium 
selection - which strategy will be played in the long run with a high frequency. 

Our populations are characterized either by numbers of individuals playing respective strategies in 
well-mixed populations or by a complete profile - assignment of strategies to players in spatial games. 
Let $\Omega$ be a state space of our system. For non-spatial games with two strategies, $\Omega = \{0, 1, \ldots, n\}$, 
where $n$ is the number of players, or $\Omega = 2^{\Lambda}$ for spatial games with players located on the finite subset $\Lambda$ 
of $Z$, $Z^2$, or any other infinite graph, and interacting with their neighbours. In well-mixed populations, 
in discrete moments of time, some individuals switch to a strategy with a higher mean payoff. In 
spatial games, players choose strategies which are best responses, i.e. ones which maximize the sum of 
the payoffs obtained from individual games. The above rules define deterministic dynamics with some 
stochastic part corresponding to a random matching of players or a random choice of players who 
may revise their strategies. We call this mutation-free or noise-free dynamics. It is a Markov chain 
with a state space $\Omega$ and a transition matrix $P^0$. We are especially interested in absorbing states, i.e. 
rest points of our mutation-free dynamics. Now, with a small probability, $\epsilon$, players may mutate or 
make mistakes of not choosing the best reply. The presence of mutations allows the system to make a 
transition from any state to any other state with a positive probability in some finite number of steps 
or to stay at any state for an arbitrarily long time. This makes our Markov chains with 
a transition matrix $P^{\epsilon}$ ergodic ones. They have therefore unique stationary measures. To describe 
the long-run behavior of stochastic dynamics of finite populations, Foster and Young [22] introduced 
a concept of stochastic stability. A state of the system is stochastically stable if it has a positive 
probability in the stationary measure of the corresponding Markov chain in the zero-noise limit, that 
is the zero probability of mistakes or the zero-mutation level. It means that along almost any time 
trajectory the frequency of visiting this state converges to a positive value given by the stationary 
measure. Let $\mu^{\epsilon}$ be the stationary measure of our Markov chain. 

Definition 2 $X \in \Omega$ is stochastically stable if $\lim_{\epsilon \to 0} \mu^{\epsilon}(X) > 0$. 

It is a fundamental problem to find stochastically stable states for any stochastic dynamics of interest. 
We will use the following tree representation of stationary measures of Markov chains proposed 
by Freidlin and Wentzell [107, 23], see also [83]. Let $(\Omega, P^{\epsilon})$ be an ergodic Markov chain with a state 
space $\Omega$, transition probabilities given by the transition matrix $P^{\epsilon}: \Omega \times \Omega \to [0, 1]$, where $P^{\epsilon}(Y, Y')$ 
is a conditional probability that the system will be in the state $Y' \in \Omega$ at the time $t + 1$, if it was in 
the state $Y \in \Omega$ at the time $t$, and a unique stationary measure, $\mu^{\epsilon}$, also called a stationary state. A 
stationary state is an eigenvector of $P^{\epsilon}$ corresponding to the eigenvalue 1, i.e. a solution of a system 
of linear equations, 

$$\mu^{\epsilon} P^{\epsilon} = \mu^{\epsilon}, \tag{48}$$

where $\mu^{\epsilon}$ is a row vector. After specific rearrangements one can arrive at an expression 
for the stationary state which involves only positive terms. This will be very useful in describing the 
asymptotic behaviour of stationary states. 

For $X \in \Omega$, let an X-tree be a directed graph on $\Omega$ such that from every $Y \neq X$ there is a unique 
path to $X$ and there are no outgoing edges out of $X$. Denote by $T(X)$ the set of all X-trees and let 

$$q^{\epsilon}(X) = \sum_{d \in T(X)} \prod_{(Y, Y') \in d} P^{\epsilon}(Y, Y'), \tag{49}$$

where the product is with respect to all edges of $d$. 
We have that 

$$\mu^{\epsilon}(X) = \frac{q^{\epsilon}(X)}{\sum_{Y \in \Omega} q^{\epsilon}(Y)} \tag{50}$$

for all $X \in \Omega$. 

We assume that our noise-free dynamics, i.e. in the case of $\epsilon = 0$, has at least one absorbing state 
and there are no absorbing sets (recurrent classes) consisting of more than one state. It then follows 
from (50) that only absorbing states can be stochastically stable. 

Let us begin with the case of two absorbing states, $X$ and $Y$. Consider a dynamics in which 
$P^{\epsilon}(Z, W)$ for all $Z, W \in \Omega$, is of order $\epsilon^m$, where $m$ is the number of mistakes involved to pass from $Z$ 
to $W$. The noise-free limit of $\mu^{\epsilon}$ in the form (50) has a 0/0 character. Let $m_{XY}$ be a minimal number 
of mistakes needed to make a transition from the state $X$ to $Y$ and $m_{YX}$ the minimal number of 
mistakes to evolve from $Y$ to $X$. Then $q^{\epsilon}(X)$ is of the order $\epsilon^{m_{YX}}$ and $q^{\epsilon}(Y)$ is of the order $\epsilon^{m_{XY}}$. 
If for example $m_{YX} < m_{XY}$, then $\lim_{\epsilon \to 0} \mu^{\epsilon}(X) = 1$, hence $X$ is stochastically stable. 

In general, to study the zero-noise limit of the stationary measure, it is enough to consider paths 
between absorbing states. More precisely, we construct X-trees with absorbing states $X^k$, $k = 1, \ldots, l$, 
as vertices; the family of such X-trees is denoted by $\tilde{T}(X)$. Let 

$$q_m(X) = \max_{d \in \tilde{T}(X)} \prod_{(Y, Y') \in d} \tilde{P}(Y, Y'), \tag{51}$$

where $\tilde{P}(Y, Y') = \max \prod_{(W, W')} P(W, W')$, where the product is taken along any path joining $Y$ 
with $Y'$ and the maximum is taken with respect to all such paths. Now we may observe that if 
$\lim_{\epsilon \to 0} q_m(X^i)/q_m(X^k) = 0$, for every $i = 1, \ldots, l$, $i \neq k$, then $X^k$ is stochastically stable. Therefore we 
have to compare trees with the biggest products in (51); such trees are called maximal. 

The above characterisation of the stationary measure was used to find stochastically stable states 
in non-spatial [40, 105, 76, 100, 106, 45] and spatial games [17, 18]. We will use it below in our 
examples. 
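
The tree representation (49)-(50) can be evaluated by brute force for a small chain. The sketch below enumerates all X-trees of a three-state chain; the chain itself, with states 0 and 2 absorbing at zero noise and two mistakes needed to leave state 0 but only one to leave state 2, is a hypothetical example constructed here, and the function names are illustrative.

```python
from itertools import product


def reaches_root(s, edge, root, n):
    # Follow the unique outgoing edges; a tree path must hit the root within n steps.
    for _ in range(n):
        if s == root:
            return True
        s = edge[s]
    return s == root


def tree_weights(P):
    """q(X) of eq. (49): sum over all X-trees of the product of edge probabilities."""
    n = len(P)
    q = [0.0] * n
    for root in range(n):
        others = [s for s in range(n) if s != root]
        # Every non-root state chooses one successor; keep only choices forming a tree.
        for succ in product(range(n), repeat=len(others)):
            edge = dict(zip(others, succ))
            if any(s == t for s, t in edge.items()):
                continue
            if all(reaches_root(s, edge, root, n) for s in others):
                w = 1.0
                for s, t in edge.items():
                    w *= P[s][t]
                q[root] += w
    return q


def stationary_from_trees(P):
    """Stationary measure via eq. (50)."""
    q = tree_weights(P)
    total = sum(q)
    return [v / total for v in q]


def example_chain(eps):
    # Leaving state 0 costs two mistakes (eps**2), leaving state 2 costs one (eps).
    return [[1 - eps ** 2, eps ** 2, 0.0],
            [0.5, 0.0, 0.5],
            [0.0, eps, 1 - eps]]
```

For `example_chain(eps)` the tree weights are $q(0) = \epsilon/2$, $q(1) = \epsilon^3$ and $q(2) = \epsilon^2/2$, so the stationary measure concentrates on state 0 as $\epsilon \to 0$, in line with the mistake-counting argument above.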

In many cases, there exists a state $X$ such that $\lim_{\epsilon \to 0} \mu^{\epsilon}(X) = 1$. Then 
we say that X was selected in the zero-noise limit of a given stochastic dynamics. However, for any 
low but fixed mutation level, when the number of players is very large, the frequency of visiting any 
single state can be arbitrarily low. It is an ensemble of states that can have a probability close to one 
in the stationary measure. The concept of the ensemble stability is discussed in Chapter 9. 

8 Stochastic dynamics of well-mixed populations 

Here we will discuss stochastic dynamics of well-mixed populations of players interacting in discrete 
moments of time. We will analyze two-player games with two strategies and two pure Nash equilibria. 
The efficient strategy (also called payoff dominant) when played by the whole population results in its 
highest possible payoff (fitness). The risk-dominant one is played by individuals averse to risk. The 
strategy is risk dominant if it has a higher expected payoff against a player playing both strategies 
with equal probabilities [29]. We will address the problem of equilibrium selection - which strategy 
will be played in the long run with a high frequency. 

We will review two models of dynamics of a population with a fixed number of individuals. In 
both of them, the selection part of the dynamics ensures that if the mean payoff of a given strategy is 
bigger than the mean payoff of the other one, then the number of individuals playing the given strategy 
increases. In the first model, introduced by Kandori, Mailath, and Rob [40], one assumes (as in the 
standard replicator dynamics) that individuals receive average payoffs with respect to all possible 
opponents - they play against the average strategy. In the second model, introduced by Robson and 
Vega-Redondo [76], at any moment of time, individuals play only one or few games with randomly 
chosen opponents. In both models, players may mutate with a small probability, hence the population 
may move against a selection pressure. Kandori, Mailath, and Rob showed that in their model, the 
risk-dominant strategy is stochastically stable - if the mutation level is small enough we observe it in 
the long run with the frequency close to one [40]. In the model of Robson and Vega-Redondo, the 
efficient strategy is stochastically stable [76, 100]. It is one of very few models in which an efficient 
strategy is stochastically stable in the presence of a risk-dominant one. The population evolves in the 
long run to a state with the maximal fitness. 

The main goal of this chapter is to investigate the effect of the number of players on the long-run 
behaviour of the Robson-Vega-Redondo model [54]. We will discuss a sequential dynamics and 
the one where each individual enjoys each period a revision opportunity with the same probability. 
We will show that for any arbitrarily low but fixed level of mutations, if the number of players 
is sufficiently large, then a risk-dominant strategy is played in the long run with a frequency close 
to one - a stochastically stable efficient strategy is observed with a very low frequency. It means 
that when the number of players increases, the population undergoes a transition between an efficient 
payoff-dominant equilibrium and a risk-dominant one. We will also show that for some range of payoff 
parameters, stochastic stability itself depends on the number of players. If the number of players is 
below a certain value (which may be arbitrarily large), then a risk-dominant strategy is stochastically 
stable. Only if the number of players is large enough does an efficient strategy become stochastically 
stable, as proved by Robson and Vega-Redondo. 

Combining the above results we see that for a low but fixed noise level, the population undergoes 
twice a transition between its two equilibria as the number of individuals increases [57]. In addition, 
for a sufficiently large number of individuals, the population undergoes another equilibrium transition 
when the noise decreases. 

Let us formally introduce our models. We will consider a finite population of n individuals who 
have at their disposal one of two strategies: A and B. At every discrete moment of time, t = 1,2, ... 
individuals are randomly paired (we assume that n is even) to play a two-player symmetric game with 
payoffs given by the following matrix: 

$$U = \begin{array}{c|cc} & A & B \\ \hline A & a & b \\ B & c & d \end{array},$$

where $a > c$, $d > b$, $a > d$, and $a + b < c + d$, so $(A, A)$ is an efficient Nash equilibrium and $(B, B)$ 
is a risk-dominant one. 

At the time $t$, the state of our population is described by the number of individuals, $z_t$, playing A. 
Formally, by the state space we mean the set 

$$\Omega = \{z, 0 \leq z \leq n\}.$$

Now we will describe the dynamics of our system. It consists of two components: selection and 
mutation. The selection mechanism ensures that if the mean payoff of a given strategy, $\pi_i(z_t)$, $i = A, B$, 
at the time $t$ is bigger than the mean payoff of the other one, then the number of individuals playing 
the given strategy increases in $t + 1$. In their paper, Kandori, Mailath, and Rob [40] write 

$$\pi_A(z_t) = \frac{a(z_t - 1) + b(n - z_t)}{n - 1}, \tag{52}$$

$$\pi_B(z_t) = \frac{c z_t + d(n - z_t - 1)}{n - 1},$$

provided $0 < z_t < n$. 

It means that in every time step, players are paired infinitely many times to play the game or 
equivalently, each player plays with every other player and his payoff is the sum of corresponding 
payoffs. This model may be therefore considered as an analog of replicator dynamics for populations 
with a fixed number of players. 

The selection dynamics is formalized in the following way: 

$$z_{t+1} > z_t \quad \text{if } \pi_A(z_t) > \pi_B(z_t), \tag{53}$$
$$z_{t+1} < z_t \quad \text{if } \pi_A(z_t) < \pi_B(z_t),$$
$$z_{t+1} = z_t \quad \text{if } \pi_A(z_t) = \pi_B(z_t),$$
$$z_{t+1} = z_t \quad \text{if } z_t = 0 \text{ or } z_t = n.$$

Now mutations are added. Players may switch to new strategies with the probability $\epsilon$. It is 
easy to see that for any two states of the population, there is a positive probability of the transition 
between them in some finite number of time steps. We have therefore obtained an ergodic Markov 
chain with $n + 1$ states and a unique stationary measure which we denote by $\mu^{\epsilon}_n$. Kandori, Mailath, 
and Rob proved that the risk-dominant strategy B is stochastically stable [40]. 

Theorem 8 $\lim_{\epsilon \to 0} \mu^{\epsilon}_n(0) = 1$. 

This means that in the long run, in the limit of no mutations, all players play B. 
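
The mistake-counting argument of Chapter 7 can be applied to the payoffs (52) to see why B is selected. The sketch below counts how many simultaneous mutations are needed to leave each absorbing state; it assumes that once selection favours a strategy, the dynamics (53) carries the population to the corresponding absorbing state without further mutations. The payoff values and function names are illustrative choices.

```python
from fractions import Fraction


def kmr_payoffs(z, n, a, b, c, d):
    """Mean payoffs (52) of A and B when z players use A, for 0 < z < n."""
    pi_A = Fraction(a * (z - 1) + b * (n - z), n - 1)
    pi_B = Fraction(c * z + d * (n - z - 1), n - 1)
    return pi_A, pi_B


def mutations_needed(n, a, b, c, d):
    """Return (mutations to leave z = 0, mutations to leave z = n)."""
    favours_A, favours_B = [], []
    for z in range(1, n):
        pi_A, pi_B = kmr_payoffs(z, n, a, b, c, d)
        if pi_A > pi_B:
            favours_A.append(z)
        elif pi_A < pi_B:
            favours_B.append(z)
    leave_B = min(favours_A)        # smallest z at which selection pushes towards A
    leave_A = n - max(favours_B)    # mutants needed so that selection pushes towards B
    return leave_B, leave_A
```

For instance, `mutations_needed(20, 5, 0, 2, 4)` compares the two escape costs for a game with $a + b < c + d$; leaving the all-A state is cheaper than leaving the all-B state, which is the content of Theorem 8 in this example.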

The general setup in the Robson-Vega-Redondo model [76] is the same. However, individuals are 
paired only once at every time step and play only one game before a selection process takes place. Let 
$p_t$ denote the random variable which describes the number of cross-pairings, i.e. the number of pairs 
of matched individuals playing different strategies at the time $t$. Let us notice that $p_t$ depends on $z_t$. 
For a given realization of $p_t$ and $z_t$, mean payoffs obtained by each strategy are as follows: 

$$\tilde{\pi}_A(z_t, p_t) = \frac{a(z_t - p_t) + b p_t}{z_t}, \tag{54}$$

$$\tilde{\pi}_B(z_t, p_t) = \frac{c p_t + d(n - z_t - p_t)}{n - z_t},$$

provided $0 < z_t < n$. Robson and Vega-Redondo showed that the payoff-dominant strategy is stochastically 
stable [76]. 

Theorem 9 $\lim_{\epsilon \to 0} \mu^{\epsilon}_n(n) = 1$. 

We will outline their proof. 

First of all, one can show that there exists $k$ such that if $n$ is large enough and $z_t \geq k$, then 
there is a positive probability (a certain realization of $p_t$) that after a finite number of steps of the 
mutation-free selection dynamics, all players will play A. Likewise, if $z_t < k$ (for any $k > 1$), then if 
the number of players is large enough, then after a finite number of steps of the mutation-free selection 
dynamics all players will play B. In other words, $z = 0$ and $z = n$ are the only absorbing states of 
the mutation-free dynamics. Moreover, if $n$ is large enough, then if $z_t \geq n - k$, then the mean payoff 
obtained by A is always (for any realization of $p_t$) bigger than the mean payoff obtained by B (in the 
worst case all B-players play with A-players). Therefore the size of the basin of attraction of the state 
$z = 0$ is at most $n - k - 1$ and that of $z = n$ is at least $n - k$. Observe that mutation-free dynamics is 
not deterministic ($p_t$ describes the random matching) and therefore basins of attraction may overlap. 
It follows that the system needs at least $k + 1$ mutations to evolve from $z = n$ to $z = 0$ and at most 
$k$ mutations to evolve from $z = 0$ to $z = n$. Now using the tree representation of stationary states, 
Robson and Vega-Redondo finish the proof and show that the efficient strategy is stochastically stable. 

However, as outlined above, their proof requires the number of players to be sufficiently large. We 
will now show that a risk-dominant strategy is stochastically stable if the number of players is below 
a certain value which can be arbitrarily large. 

Theorem 10 If $n < \frac{2a - c - b}{a - c}$, then the risk-dominant strategy B is stochastically stable in the case of 
random matching of players. 

Proof: If the population consists of only one B-player and $n - 1$ A-players and if $c > [a(n - 2) + 
b]/(n - 1)$, that is $n < (2a - c - b)/(a - c)$, then $\tilde{\pi}_B > \tilde{\pi}_A$. It means that one needs only one mutation 
to evolve from $z = n$ to $z = 0$. It is easy to see that two mutations are necessary to evolve from $z = 0$ 
to $z = n$. 

To see stochastically stable states, we need to take the limit of no mutations. We will now examine
the long-run behavior of the Robson- Vega-Redondo model for a fixed level of mutations in the limit 
of the infinite number of players. 

Now we will analyze the extreme case of the selection rule (53): a sequential dynamics where in
one time unit only one player can change his strategy. Although our dynamics is discrete in time, it
captures the essential features of continuous-time models in which every player has an exponentially
distributed waiting time to a moment of a revision opportunity. The probability that two or more players
revise their strategies at the same time is therefore equal to zero; this is an example of a birth and
death process.

The number of A-players in the population may increase by one in $t + 1$ if a B-player is chosen in $t$,
which happens with the probability $(n - z_t)/n$. Analogously, the number of B-players in the population
may increase by one in $t + 1$ if an A-player is chosen in $t$, which happens with the probability $z_t/n$.

The player who has a revision opportunity chooses in $t + 1$ with the probability $1 - \epsilon$ the strategy
with a higher average payoff in $t$ and the other one with the probability $\epsilon$. Let

$$r(k) = P(\pi_A(z_t, p_t) > \pi_B(z_t, p_t)) \quad \text{and} \quad l(k) = P(\pi_A(z_t, p_t) < \pi_B(z_t, p_t)),$$

where $z_t = k$.

The sequential dynamics is described by the following transition probabilities:

if $z_t = 0$, then $z_{t+1} = 1$ with the probability $\epsilon$ and $z_{t+1} = 0$ with the probability $1 - \epsilon$,

if $z_t = n$, then $z_{t+1} = n - 1$ with the probability $\epsilon$ and $z_{t+1} = n$ with the probability $1 - \epsilon$,

if $z_t = k \neq 0, n$, then $z_{t+1} = z_t + 1$ with the probability

$$r(k)\frac{n - k}{n}(1 - \epsilon) + (1 - r(k))\frac{n - k}{n}\epsilon$$

and $z_{t+1} = z_t - 1$ with the probability

$$l(k)\frac{k}{n}(1 - \epsilon) + (1 - l(k))\frac{k}{n}\epsilon.$$
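
The transition rules of the sequential dynamics define a birth and death chain on the states 0, ..., n, so its stationary distribution can be obtained from the balance of probability flows between neighbouring states. The following Python sketch illustrates this under stated assumptions: the functions `r` and `l` are passed in as arguments because their form depends on the random-matching payoffs, and the threshold rule `r_toy` in the usage lines is purely hypothetical and not part of the model.

```python
from typing import Callable, List, Tuple


def step_probabilities(k: int, n: int, eps: float,
                       r: Callable[[int], float],
                       l: Callable[[int], float]) -> Tuple[float, float]:
    """Return (P[z -> z + 1], P[z -> z - 1]) of the sequential dynamics at z = k."""
    if k == 0:
        return eps, 0.0          # only a mutation creates the first A-player
    if k == n:
        return 0.0, eps          # only a mutation creates the first B-player
    b_chosen = (n - k) / n       # a B-player gets the revision opportunity
    a_chosen = k / n             # an A-player gets the revision opportunity
    up = r(k) * b_chosen * (1 - eps) + (1 - r(k)) * b_chosen * eps
    down = l(k) * a_chosen * (1 - eps) + (1 - l(k)) * a_chosen * eps
    return up, down


def stationary_distribution(n: int, eps: float,
                            r: Callable[[int], float],
                            l: Callable[[int], float]) -> List[float]:
    """Stationary distribution of the birth and death chain (requires 0 < eps < 1)."""
    weights = [1.0]
    for k in range(n):
        up_k, _ = step_probabilities(k, n, eps, r, l)
        _, down_next = step_probabilities(k + 1, n, eps, r, l)
        # Flow balance between k and k + 1: pi[k] * up_k == pi[k + 1] * down_next.
        weights.append(weights[-1] * up_k / down_next)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical threshold rule, for illustration only: A earns more once k >= 4.
r_toy = lambda k: 1.0 if k >= 4 else 0.0
l_toy = lambda k: 1.0 - r_toy(k)
pi = stationary_distribution(10, 0.1, r_toy, l_toy)
```

The product formula works only because the chain moves by at most one step at a time; it does not apply to the intermediate dynamics, where several players may revise simultaneously.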

In the dynamics intermediate between the parallel one (where all individuals can revise their strategies
at the same time) and the sequential one, each individual has a revision opportunity with the same
probability $\tau < 1$ during the time interval of the length 1. For a fixed $\epsilon$ and an arbitrarily large but
fixed $n$, we consider the limit of the continuous time, $\tau \to 0$, and show that the limiting behaviour is
already obtained for a sufficiently small $\tau$, namely $\tau < \epsilon/n^3$.

For an interesting discussion on the importance of the order of taking different limits ($\tau \to 0$, $n \to \infty$,
and $\epsilon \to 0$) in evolutionary models (especially in the Aspiration and Imitation model) see Samuelson
[79].

In the intermediate dynamics, instead of the probabilities $(n - z_t)/n$ and $z_t/n$ we have more involved
combinatorial factors. In order to get rid of these inconvenient factors, we will enlarge the state space
of the population. The state space $\Omega'$ is the set of all configurations of players, that is all possible
assignments of strategies to individual players. Therefore, a state $z_t = k$ in $\Omega$ consists of $\binom{n}{k}$ states
in $\Omega'$. Observe that the sequential dynamics on $\Omega'$ is no longer a birth and death process. However,
we are able to treat both dynamics in the same framework.

We showed in [54] that for any arbitrarily low but fixed level of mutation, if the number of players
is large enough, then in the long run only a small fraction of the population plays the payoff-dominant
strategy. The smaller the mutation level, the fewer players use the payoff-dominant strategy.

The following two theorems were proven in [54].

Theorem 11 In the sequential dynamics, for any $\delta > 0$ and $\beta > 0$ there exist $\epsilon(\delta, \beta)$ and $n(\epsilon)$ such
that for any $n > n(\epsilon)$

$$\mu_n^{\epsilon}(z < \beta n) > 1 - \delta.$$

Theorem 12 In the intermediate dynamics, for any $\delta > 0$ and $\beta > 0$ there exist $\epsilon(\delta, \beta)$
and $n(\epsilon)$ such that for any $n > n(\epsilon)$ and $\tau < \epsilon/n^3$

$$\mu_n^{\epsilon}(z < \beta n) > 1 - \delta.$$

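We can combine Theorems 9, 10, and 12 and obtain [57]

Theorem 13 In the intermediate dynamics, for any $\delta > 0$ and $\beta > 0$ there exists $\epsilon(\delta, \beta)$ such that,
for all $\epsilon < \epsilon(\delta, \beta)$, there exist $n_1 < n_2 < n_3(\epsilon) < n_4(\epsilon)$ such that

if $n < n_1 = (2a - c - b)/(a - c)$, then $\mu_n^{\epsilon}(z = 0) > 1 - \delta$,

if $n_2 < n < n_3(\epsilon)$, then $\mu_n^{\epsilon}(z = n) > 1 - \delta$,

if $n > n_4(\epsilon)$ and $\tau < \epsilon/n^3$, then $\mu_n^{\epsilon}(z < \beta n) > 1 - \delta$.

Small $\tau$ means that our dynamics is close to the sequential one. We have that $n_3(\epsilon)$, $n_4(\epsilon)$, $n_3(\epsilon) - n_2$,
and $n_4(\epsilon) - n_3(\epsilon) \to \infty$ when $\epsilon \to 0$.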

It follows from Theorem 13 that the population of players undergoes several equilibrium transitions.
First of all, for a fixed noise level, when the number of players increases, the population switches
from a B-equilibrium, where most of the individuals play the strategy B, to an A-equilibrium and
then back to the B one. We know that if $n > n_2$, then $z = n$ is stochastically stable. Therefore, for
any fixed number of players, $n > n_4(\epsilon)$, when the noise level decreases, the population undergoes a
transition from a B-equilibrium to the A one. We see that in order to study the long-run behaviour of
stochastic population dynamics, we should estimate the relevant parameters to be sure what limiting
procedures are appropriate in specific examples.

Let us note that the above theorems concern an ensemble of states, not an individual one. In the 
limit of the infinite number of players, that is the infinite number of states, every single state has 
zero probability in the stationary state. It is an ensemble of states that might be stable [51, 53]. The 
concept of ensemble stability will be discussed in Chapter 9. 

9 Spatial games with local interactions 
9.1 Nash configurations and stochastic dynamics 

In spatial games, players are located on vertices of certain graphs and they interact only with their
neighbours; see for example [62, 63, 64, 5, 17, 106, 18, 44, 41, 7, 86, 87, 89, 30, 31, 32, 33] and a recent
review paper [90] and references therein.

Let $\Lambda$ be a finite subset of the simple lattice $Z^d$. Every site of $\Lambda$ is occupied by one player who has
at his disposal one of $m$ different pure strategies. Let $S$ be the set of strategies; then $\Omega_\Lambda = S^\Lambda$ is the
space of all possible configurations of players, that is all possible assignments of strategies to individual
players. For every $i \in \Lambda$, $X_i$ is the strategy of the $i$-th player in the configuration $X \in \Omega_\Lambda$ and $X_{-i}$
denotes strategies of all remaining players; $X$ therefore can be represented as the pair $(X_i, X_{-i})$. Every
player interacts only with his nearest neighbours and his payoff is the sum of the payoffs resulting from
individual plays. We assume that he has to use the same strategy for all neighbours. Let $N_i$ denote the
neighbourhood of the $i$-th player. For the nearest-neighbour interaction we have $N_i = \{j; |j - i| = 1\}$,
where $|i - j|$ is the distance between $i$ and $j$. For $X \in \Omega_\Lambda$ we denote by $\nu_i(X)$ the payoff of the $i$-th
player in the configuration $X$:

$$\nu_i(X) = \sum_{j \in N_i} U(X_i, X_j), \tag{55}$$

where $U$ is an $m \times m$ matrix of payoffs of a two-player symmetric game with $m$ pure strategies.

Definition 3 $X \in \Omega_\Lambda$ is a Nash configuration if for every $i \in \Lambda$ and $Y_i \in S$,

$$\nu_i(X_i, X_{-i}) \geq \nu_i(Y_i, X_{-i}).$$

Here we will discuss only coordination games, where there are m pure symmetric Nash equilibria 
and therefore m homogeneous Nash configurations, where all players play the same strategy. 

In the Stag-hunt game in Example 1, we have two homogeneous Nash configurations, $X^{St}$ and
$X^H$, where all individuals play St or H respectively.

We describe now the sequential deterministic dynamics of the best-response rule. Namely, at
each discrete moment of time $t = 1, 2, \ldots$, a randomly chosen player may update his strategy. He
simply adopts the strategy $X_i^{t+1}$ which gives him the maximal total payoff $\nu_i(X_i^{t+1}, X_{-i}^t)$ for given
$X_{-i}^t$, a configuration of strategies of the remaining players at the time $t$.
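
The payoff of a player, the Nash-configuration test, and one step of the best-response rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the players sit on a ring with at least three sites (one-dimensional lattice with periodic boundary conditions), strategies are encoded as integers, and the coordination game `U_toy` is a hypothetical example rather than a game taken from these notes.

```python
from typing import List, Optional, Sequence


def neighbours(i: int, n: int) -> List[int]:
    """Nearest neighbours of site i on a ring of n >= 3 sites."""
    return [(i - 1) % n, (i + 1) % n]


def payoff(i: int, config: Sequence[int], U,
           strategy: Optional[int] = None) -> float:
    """Sum of U(X_i, X_j) over the neighbours j of i; X_i may be overridden."""
    s = config[i] if strategy is None else strategy
    return sum(U[s][config[j]] for j in neighbours(i, len(config)))


def is_nash_configuration(config: Sequence[int], U) -> bool:
    """No player can gain by a unilateral change of strategy."""
    return all(payoff(i, config, U) >= payoff(i, config, U, y)
               for i in range(len(config)) for y in range(len(U)))


def best_response_step(config: Sequence[int], U, i: int) -> List[int]:
    """Player i adopts a strategy with the maximal payoff against his neighbours."""
    best = max(range(len(U)), key=lambda y: payoff(i, config, U, y))
    updated = list(config)
    updated[i] = best
    return updated


# Hypothetical two-strategy coordination game (0 and 1 are strategy labels).
U_toy = [[2, 0], [0, 1]]
```

In a full simulation the index `i` would be drawn at random at every time step; here it is passed explicitly so that the step is deterministic and easy to inspect. Ties are broken in favour of the lowest strategy index, which is a choice of this sketch.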

Now we allow players to make mistakes, that is they may not choose best responses. We will discuss 
two types of such stochastic dynamics. In the first one, the so-called perturbed best response, a 
player follows the best-response rule with probability 1 — e (in case of more than one best-response 
strategy he chooses randomly one of them) and with probability e he makes a mistake and chooses 
randomly one of the remaining strategies. The probability of mistakes (or the noise level) is state- 
independent here. 

In the so-called log-linear dynamics, the probability that the $i$-th player chooses the strategy
$X_i^{t+1}$ at the time $t + 1$ decreases with the loss of the payoff and is given by the following conditional
probability:

$$p_i^{\epsilon}(X_i^{t+1} | X_{-i}^t) = \frac{e^{\frac{1}{\epsilon}\nu_i(X_i^{t+1}, X_{-i}^t)}}{\sum_{Y_i \in S} e^{\frac{1}{\epsilon}\nu_i(Y_i, X_{-i}^t)}}. \tag{56}$$

Let us observe that if $\epsilon \to 0$, $p_i^{\epsilon}$ converges pointwise to the best-response rule. Both stochastic
dynamics are examples of ergodic Markov chains with $|S^\Lambda|$ states. Therefore they have unique stationary
states denoted by $\mu_\Lambda^{\epsilon}$.
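
As a small numerical illustration of the log-linear rule, the Python sketch below turns a list of payoffs (one per available strategy, against fixed neighbours) into choice probabilities proportional to the exponential of the payoff divided by the noise level. The function name and the payoff lists in the checks are assumptions made for this example only.

```python
import math
from typing import List, Sequence


def log_linear_probabilities(payoffs: Sequence[float], eps: float) -> List[float]:
    """Choice probabilities proportional to exp(payoff / eps), for eps > 0."""
    top = max(payoffs)
    # Subtracting the maximum leaves the ratios unchanged and avoids overflow.
    weights = [math.exp((p - top) / eps) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]
```

For a low noise level almost all the probability goes to the strategies with the maximal payoff, split equally among them in case of a tie, which is the best-response rule recovered in the zero-noise limit.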

Stationary states of the log-linear dynamics can be explicitly constructed for the so-called potential 
games. A game is called a potential game if its payoff matrix can be changed to a symmetric one 
by adding payoffs to its columns [49]. As we know, such a payoff transformation does not change 
strategic character of the game, in particular it does not change the set of its Nash equilibria. More 
formally, we have the following definition. 

Definition 4 A two-player symmetric game with a payoff matrix $U$ is a potential game if there exists
a symmetric matrix $V$, called a potential of the game, such that for any three strategies $A, B, C \in S$

$$U(A, C) - U(B, C) = V(A, C) - V(B, C). \tag{57}$$

It is easy to see that every game with two strategies has a potential $V$ with $V(A, A) = a - c$,
$V(B, B) = d - b$, and $V(A, B) = V(B, A) = 0$. It follows that an equilibrium is risk-dominant if and
only if it has a bigger potential.
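
The potential of a two-strategy game can be checked mechanically. The sketch below assumes the payoff convention $U(A, A) = a$, $U(A, B) = b$, $U(B, A) = c$, $U(B, B) = d$ and encodes A and B as 0 and 1; the numerical payoffs used in the usage lines are hypothetical.

```python
def two_strategy_potential(a: float, b: float, c: float, d: float):
    """Potential with V(A,A) = a - c, V(B,B) = d - b and zero off the diagonal."""
    return [[a - c, 0.0], [0.0, d - b]]


def satisfies_potential_condition(U, V, tol: float = 1e-12) -> bool:
    """Check that V is symmetric and that U(x,z) - U(y,z) = V(x,z) - V(y,z)."""
    m = len(U)
    symmetric = all(abs(V[i][j] - V[j][i]) <= tol
                    for i in range(m) for j in range(m))
    condition = all(abs((U[x][z] - U[y][z]) - (V[x][z] - V[y][z])) <= tol
                    for x in range(m) for y in range(m) for z in range(m))
    return symmetric and condition


# Hypothetical game: A is payoff dominant (5 > 3), B is risk dominant (3 > 2).
U_example = [[5.0, 0.0], [3.0, 3.0]]
V_example = two_strategy_potential(5.0, 0.0, 3.0, 3.0)
```

In this example $V(B, B) = 3 > V(A, A) = 2$, so the equilibrium with the bigger potential is the risk-dominant one even though A gives the higher payoff.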

For players on a lattice, for any $X \in \Omega_\Lambda$,

$$V(X) = \sum_{(i,j) \subset \Lambda} V(X_i, X_j) \tag{58}$$

is then the potential of the configuration $X$.

For the sequential log-linear dynamics of potential games, one can explicitly construct stationary
measures [106].

We begin with the following general definition concerning a Markov chain with a state space $\Omega$ and
a transition matrix $P$.

Definition 5 A measure $\mu$ on $\Omega$ satisfies a detailed balance condition if

$$\mu(X) P_{XY} = \mu(Y) P_{YX}$$

for every $X, Y \in \Omega$.

Lemma If $\mu$ satisfies the detailed balance condition then it is a stationary measure.

Proof:

$$\sum_{X \in \Omega} \mu(X) P_{XY} = \sum_{X \in \Omega} \mu(Y) P_{YX} = \mu(Y).$$

The following theorem is due to Peyton Young [106]. We will present here his proof.

Theorem 14 The stationary measure of the sequential log-linear dynamics in a game with the potential
$V$ is given by

$$\mu_\Lambda^{\epsilon}(X) = \frac{e^{\frac{1}{\epsilon}V(X)}}{\sum_{Z \in \Omega_\Lambda} e^{\frac{1}{\epsilon}V(Z)}}. \tag{59}$$

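Proof:

We will show that $\mu_\Lambda^{\epsilon}$ in (59) satisfies the detailed balance condition. Let us notice that in the
sequential dynamics, $P_{XY} = 0$ unless $X = Y$ or $Y$ differs from $X$ at one lattice site only, say $i \in \Lambda$.
Let

$$r_{XY} = \frac{1}{|\Lambda|} \frac{1}{\sum_{Z \in \Omega_\Lambda} e^{\frac{1}{\epsilon}V(Z)}} \frac{1}{\sum_{Z_i \in S} e^{\frac{1}{\epsilon}\sum_{j \in N_i} U(Z_i, X_j)}}.$$

Then

$$\mu_\Lambda^{\epsilon}(X) P_{XY} = r_{XY}\, e^{\frac{1}{\epsilon}\left(V(X) + \sum_{j \in N_i} U(Y_i, X_j)\right)} = r_{XY}\, e^{\frac{1}{\epsilon}\left(V(Y) + \sum_{j \in N_i} U(X_i, X_j)\right)} = \mu_\Lambda^{\epsilon}(Y) P_{YX},$$

where the middle equality follows from (57) and the last one from $X_j = Y_j$ for $j \neq i$.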

We may now explicitly perform the limit $\epsilon \to 0$ in (59). In the Stag-hunt game, $X^H$ has a bigger
potential than $X^{St}$ so $\lim_{\epsilon \to 0} \mu_\Lambda^{\epsilon}(X^H) = 1$, hence $X^H$ is stochastically stable (we also say that H is
stochastically stable).
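
The statement of Theorem 14 can be checked by brute force on a very small system. The Python sketch below enumerates all configurations on a ring of a few players, builds the measure proportional to the exponential of the potential divided by the noise level, and builds the transition probabilities of the sequential log-linear dynamics with a uniformly chosen player. The two-strategy payoffs `U_small` and their potential `V_small` are hypothetical values chosen for illustration (the Stag-hunt payoffs of Example 1 are not reproduced here), and the ring is assumed to have at least three sites.

```python
import itertools
import math


def gibbs_measure(n: int, V, eps: float) -> dict:
    """mu(X) proportional to exp(V(X) / eps), V(X) summed over neighbouring pairs."""
    configs = list(itertools.product(range(len(V)), repeat=n))

    def potential(x):
        return sum(V[x[i]][x[(i + 1) % n]] for i in range(n))

    weights = [math.exp(potential(x) / eps) for x in configs]
    total = sum(weights)
    return {x: w / total for x, w in zip(configs, weights)}


def log_linear_transition(x, y, U, eps: float) -> float:
    """P_XY of the sequential log-linear dynamics on a ring."""
    n = len(x)
    changed = [i for i in range(n) if x[i] != y[i]]
    if len(changed) > 1:
        return 0.0            # only one player may revise in one time step

    def choice_prob(i: int, s: int) -> float:
        pay = lambda t: U[t][x[(i - 1) % n]] + U[t][x[(i + 1) % n]]
        norm = sum(math.exp(pay(t) / eps) for t in range(len(U)))
        return math.exp(pay(s) / eps) / norm

    if changed:
        i = changed[0]
        return choice_prob(i, y[i]) / n
    # X == Y: some player is chosen and keeps his current strategy.
    return sum(choice_prob(i, x[i]) for i in range(n)) / n


U_small = [[5.0, 0.0], [3.0, 3.0]]   # hypothetical payoffs
V_small = [[2.0, 0.0], [0.0, 3.0]]   # their potential: a - c = 2, d - b = 3
mu = gibbs_measure(4, V_small, 0.7)
```

Summing `mu[x] * log_linear_transition(x, y, ...)` over all `x` reproduces `mu[y]`, and for a small noise level the measure concentrates on the homogeneous configuration with the bigger potential.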

The concept of a Nash configuration in spatial games is very similar to the concept of a ground-state 
configuration in lattice-gas models of interacting particles. We will discuss similarities and differences
between these two systems of interacting entities in the next section. 

9.2 Ground states and Nash configurations 

We will present here one of the basic models of interacting particles. In classical lattice-gas models, 
particles occupy lattice sites and interact only with their neighbours. The fundamental concept is that 
of a ground-state configuration. It can be formulated conveniently in the limit of an infinite lattice 
(the infinite number of particles). Let us assume that every site of the $Z^d$ lattice can be occupied by
one of $m$ different particles. An infinite-lattice configuration is an assignment of particles to lattice
sites, i.e. an element of $\Omega = \{1, \ldots, m\}^{Z^d}$. If $X \in \Omega$ and $i \in Z^d$, then we denote by $X_i$ a restriction
of $X$ to $i$. We will assume here that only nearest-neighbour particles interact. The energy of their
interaction is given by a symmetric $m \times m$ matrix $V$. An element $V(A, B)$ is the interaction energy of
two nearest-neighbour particles of the type $A$ and $B$. The total energy of a system in the configuration
$X$ in a finite region $\Lambda \subset Z^d$ can be then written as

$$H_\Lambda(X) = \sum_{(i,j) \subset \Lambda} V(X_i, X_j). \tag{60}$$

$Y$ is a local excitation of $X$, $Y \sim X$, $Y, X \in \Omega$, if there exists a finite $\Lambda \subset Z^d$ such that $X = Y$
outside $\Lambda$.

For $Y \sim X$, the relative energy is defined by

$$H(Y, X) = \sum_{(i,j)} \left( V(Y_i, Y_j) - V(X_i, X_j) \right), \tag{61}$$

where the summation is with respect to pairs of nearest neighbours on $Z^d$. Observe that this is a
finite sum; the energy difference between $Y$ and $X$ is equal to zero outside some finite $\Lambda$.

Definition 6 $X \in \Omega$ is a ground-state configuration of $V$ if

$$H(Y, X) \geq 0 \quad \text{for any } Y \sim X.$$

That is, we cannot lower the energy of a ground-state configuration by changing it locally.
The energy density $e(X)$ of a configuration $X$ is

$$e(X) = \liminf_{\Lambda \to Z^d} \frac{H_\Lambda(X)}{|\Lambda|}, \tag{62}$$

where $|\Lambda|$ is the number of lattice sites in $\Lambda$.

It can be shown that any ground-state configuration has the minimal energy density [85]. It 
means that local conditions present in the definition of a ground-state configuration force the global 
minimization of the energy density. 

We see that the concept of a ground-state configuration is very similar to that of a Nash config- 
uration. We have to identify particles with agents, types of particles with strategies and instead of 
minimizing interaction energies we should maximize payoffs. There are however profound differences. 
First of all, ground-state configurations can be defined only for symmetric matrices; an interaction 
energy is assigned to a pair of particles, payoffs are assigned to individual players and may be differ- 
ent for each of them. Ground-state configurations are stable with respect to all local changes, Nash 
configurations are stable only with respect to one-player changes. It means that for the same symmet- 
ric matrix U, there may exist a configuration which is a Nash configuration but not a ground-state 
configuration for the interaction matrix $-U$. The simplest example is given by the following matrix:

Example 5

$$U = \begin{array}{c|cc} & A & B \\ \hline A & 2 & 0 \\ B & 0 & 1 \end{array}$$

$(A, A)$ and $(B, B)$ are Nash configurations for a system consisting of two players but only $(A, A)$ is
a ground-state configuration for $V = -U$. We may therefore consider the concept of a ground-state
configuration as a refinement of a Nash equilibrium.
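
The difference between the two concepts in Example 5 can be verified directly for the two-player system. The sketch below assumes that the payoffs are $U(A, A) = 2$, $U(B, B) = 1$ and $0$ off the diagonal, and encodes A and B as 0 and 1; the function names are introduced here for illustration.

```python
from itertools import product

U5 = [[2, 0], [0, 1]]   # assumed payoffs of Example 5, A = 0 and B = 1


def is_nash_pair(x, U) -> bool:
    """Each of the two players plays a best response to the other one."""
    m = len(U)
    return (all(U[x[0]][x[1]] >= U[y][x[1]] for y in range(m)) and
            all(U[x[1]][x[0]] >= U[y][x[0]] for y in range(m)))


def is_ground_state_pair(x, U) -> bool:
    """The pair minimizes the energy -U of the single bond over all pairs."""
    energy = lambda c: -U[c[0]][c[1]]
    return all(energy(x) <= energy(y)
               for y in product(range(len(U)), repeat=2))
```

The Nash test allows only one player to deviate at a time, whereas the ground-state test compares the pair with every other pair, including those where both players change their strategies.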

For any classical lattice-gas model there exists at least one ground-state configuration. This can 
be seen in the following way. We start with an arbitrary configuration. If it cannot be changed locally 
to decrease its energy it is already a ground-state configuration. Otherwise we may change it locally 
and decrease the energy of the system. If our system is finite, then after a finite number of steps we 
arrive at a ground-state configuration; at every step we decrease the energy of the system and for 
every finite system its possible energies form a finite set. For an infinite system, we have to proceed 
ad infinitum converging to a ground-state configuration (this follows from the compactness of $\Omega$ in
the product of discrete topologies). Game models are different. It may happen that a game with a 
nonsymmetric payoff matrix does not possess a Nash configuration. The classical example is that of
the Rock-Scissors-Paper game. One may show that this game does not have any Nash configurations
on $Z$ and $Z^2$ but has many Nash configurations on the triangular lattice.

In short, ground-state configurations minimize the total energy of a particle system, Nash config- 
urations do not necessarily maximize the total payoff of a population. 

A ground-state configuration is an equilibrium concept for systems of interacting particles at zero
temperature. For positive temperatures, we must take into account fluctuations caused by thermal
motions of particles. Equilibrium behaviour of the system results then from the competition between
its energy $V$ and entropy $S$ (which measures the number of configurations corresponding to a macroscopic
state), i.e. the minimization of its free energy $F = V - TS$, where $T$ is the temperature of the
system, a measure of thermal motions. At the zero temperature, $T = 0$, the minimization of the free
energy reduces to the minimization of the energy. This zero-temperature limit looks very similar to
the zero-noise limit present in the definition of the stochastic stability. Equilibrium behaviour of a
system of interacting particles can be described by specifying probabilities of occurrence for all particle
configurations. More formally, it is described by a Gibbs state (see [26] and references therein).

We construct it in the following way. Let $\Lambda$ be a finite subset of $Z^d$ and $\rho_\Lambda^T$ the following probability
mass function on $\Omega_\Lambda = \{1, \ldots, m\}^\Lambda$:

$$\rho_\Lambda^T(X) = (1/Z_\Lambda^T) \exp(-H_\Lambda(X)/T), \tag{63}$$

for every $X \in \Omega_\Lambda$, where

$$Z_\Lambda^T = \sum_{X \in \Omega_\Lambda} \exp(-H_\Lambda(X)/T) \tag{64}$$

is a normalizing factor.

We define a Gibbs state $\rho^T$ as a limit of $\rho_\Lambda^T$ as $\Lambda \to Z^d$. One can prove that a limit of a
translation-invariant Gibbs state for a given interaction as $T \to 0$ is a measure supported by ground-state
configurations. One of the fundamental problems of statistical mechanics is a characterization
of low-temperature Gibbs states for given interactions between particles.

Let us observe that the finite-volume Gibbs state in (63) is equal to the stationary state $\mu_\Lambda^{\epsilon}$ in (59) if
we identify $T$ with $\epsilon$ and $V$ with $-V$.

9.3 Ensemble stability 

The concept of stochastic stability involves individual configurations of players. In the zero-noise limit,
a stationary state is usually concentrated on one or at most few configurations. However, for a low
but fixed noise and for a sufficiently large number of players, the probability of any individual
configuration of players is practically zero. The stationary measure, however, may be highly concentrated
on an ensemble consisting of one Nash configuration and its small perturbations, i.e. configurations
where most players use the same strategy. Such configurations have relatively high probability in the
stationary measure. We call such configurations ensemble stable. Let $\mu_\Lambda^{\epsilon}$ be a stationary measure.

Definition 7 $X \in \Omega_\Lambda$ is $\gamma$-ensemble stable if $\mu_\Lambda^{\epsilon}(Y \in \Omega_\Lambda; Y_i \neq X_i) < \gamma$ for any $i \in \Lambda$ if $\Lambda \supset \Lambda(\gamma)$
for some $\Lambda(\gamma)$.

Definition 8 $X \in \Omega_\Lambda$ is low-noise ensemble stable if for every $\gamma > 0$ there exists $\epsilon(\gamma)$ such that
if $\epsilon < \epsilon(\gamma)$, then $X$ is $\gamma$-ensemble stable.

If $X$ is $\gamma$-ensemble stable with $\gamma$ close to zero, then the ensemble consisting of $X$ and configurations
which differ from $X$ at no more than a few sites has the probability close to one in the stationary
measure. It does not follow, however, that $X$ is necessarily low-noise ensemble or stochastically stable
as it happens in examples presented below [51].

Example 6

Players are located on a finite subset $\Lambda$ of $Z^2$ (with periodic boundary conditions) and interact with
their four nearest neighbours. They have at their disposal three pure strategies: A, B, and C. The
payoffs are given by the following symmetric matrix:

$$U = \begin{array}{c|ccc} & A & B & C \\ \hline A & 1.5 & 0 & 1 \\ B & 0 & 2 & 1 \\ C & 1 & 1 & 2 \end{array}$$

Our game has three Nash equilibria: $(A, A)$, $(B, B)$, and $(C, C)$, and the corresponding spatial game
has three homogeneous Nash configurations: $X^A$, $X^B$, and $X^C$, where all individuals are assigned the
same strategy. Let us notice that $X^B$ and $X^C$ have the maximal payoff in every finite volume and
therefore they are ground-state configurations for $-U$ and $X^A$ is not.

The unique stationary measure of the log-linear dynamics (56) is given by (59) with $V = U$,
which is a finite-volume Gibbs state (63) with $V$ replaced by $-U$ and $T$ by $\epsilon$. We have

$$\sum_{(i,j) \subset \Lambda} U(X_i^k, X_j^k) - \sum_{(i,j) \subset \Lambda} U(Y_i, Y_j) > 0$$

for every $Y \neq X^B$ and $X^C$, $k = B, C$, and

$$\sum_{(i,j) \subset \Lambda} U(X_i^B, X_j^B) = \sum_{(i,j) \subset \Lambda} U(X_i^C, X_j^C).$$

It follows that $\lim_{\epsilon \to 0} \mu_\Lambda^{\epsilon}(X^k) = 1/2$ for $k = B, C$, so $X^B$ and $X^C$ are stochastically stable. Let
us investigate the long-run behaviour of our system for large $\Lambda$, that is for a large number of players.
Observe that

$$\lim_{\Lambda \to Z^2} \mu_\Lambda^{\epsilon}(X) = 0$$

for every $X \in \Omega = S^{Z^2}$.

Therefore, for a large $\Lambda$ we may only observe, with reasonably positive frequencies, ensembles of
configurations and not particular configurations. We will be interested in ensembles which consist of
a Nash configuration and its small perturbations, that is configurations where most players use the
same strategy. We perform first the limit $\Lambda \to Z^2$ and obtain an infinite-volume Gibbs state in the
temperature $T = \epsilon$,

$$\mu^{\epsilon} = \lim_{\Lambda \to Z^2} \mu_\Lambda^{\epsilon}. \tag{65}$$

In order to investigate the stationary state of our example, we will apply a technique developed
by Bricmont and Slawny [8, 9]. They studied low-temperature stability of the so-called dominant
ground-state configurations. It follows from their results that

$$\mu^{\epsilon}(X_i = C) > 1 - \delta(\epsilon) \tag{66}$$

for any $i \in Z^2$ and $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ [51].

The following theorem is a simple consequence of (66).

Theorem 15 $X^C$ is low-noise ensemble stable.

We see that for any low but fixed e, if the number of players is large enough, then in the long run, 
almost all players use C strategy. On the other hand, if for any fixed number of players, e is lowered 
substantially, then B and C appear with frequencies close to 1/2. 

Let us sketch briefly the reason for such a behavior. While it is true that both $X^B$ and $X^C$
have the same potential which is the half of the payoff of the whole system (it plays the role of the
total energy of a system of interacting particles), the $X^C$ Nash configuration has more lowest-cost
excitations. Namely, one player can change its strategy and switch to either A or B and the potential
will decrease by 4 units. Players in the $X^B$ Nash configuration have only one possibility, that is to
switch to C; switching to A decreases the potential by 8. Now, the probability of the occurrence of any
configuration in the Gibbs state (which is the stationary state of our stochastic dynamics) depends
on the potential in an exponential way. One can prove that the probability of the ensemble consisting
of the $X^C$ Nash configuration and configurations which are different from it at few sites only is much
bigger than the probability of the analogous $X^B$-ensemble. It follows from the fact that the $X^C$-ensemble
has many more configurations than the $X^B$-ensemble. On the other hand, configurations
which are outside $X^B$ and $X^C$-ensembles appear with exponentially small probabilities. It means that
for large enough systems (and small but not extremely small $\epsilon$) we observe in the stationary state the
$X^C$ Nash configuration with perhaps few different strategies. The above argument was made into a
rigorous proof for an infinite system of the closely related lattice-gas model (the Blume-Capel model)
of interacting particles by Bricmont and Slawny in [8].
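
The counting of lowest-cost excitations can be reproduced with a short computation. The Python sketch below assumes the payoff matrix of Example 6 with zeros for $U(A, B)$ and $U(B, A)$, encodes A, B, C as 0, 1, 2, and uses the fact that on the square lattice a single deviating player changes the potential of his four bonds only. The function names are introduced here for illustration.

```python
# Assumed payoff matrix of Example 6 (it is symmetric, so it is its own potential).
U6 = [[1.5, 0.0, 1.0],
      [0.0, 2.0, 1.0],
      [1.0, 1.0, 2.0]]


def excitation_costs(U, base: int, degree: int = 4) -> dict:
    """Decrease of the potential when one player in the homogeneous
    configuration X^base switches to another strategy s."""
    return {s: degree * (U[base][base] - U[s][base])
            for s in range(len(U)) if s != base}


def lowest_cost_excitations(U, base: int, degree: int = 4):
    """Return the minimal cost and the strategies that realize it."""
    costs = excitation_costs(U, base, degree)
    low = min(costs.values())
    return low, [s for s, c in costs.items() if c == low]
```

For $X^C$ both deviations cost 4 units, while for $X^B$ only the deviation to C costs 4 and the deviation to A costs 8, so a configuration with $k$ isolated lowest-cost excitations can be realized in $2^k$ ways around $X^C$ but in only one way around $X^B$.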

In the above example, $X^B$ and $X^C$ have the same total payoff but $X^C$ has more lowest-cost
excitations and therefore it is low-noise ensemble stable. We will now discuss the situation where $X^C$
has a smaller total payoff but nevertheless in the long run C is played with a frequency close to 1 if
the noise level is low but not extremely low. We will consider a family of games with the following
payoff matrix:

Example 7

$$U = \begin{array}{c|ccc} & A & B & C \\ \hline A & 1.5 & 0 & 1 \\ B & 0 & 2 + \alpha & 1 \\ C & 1 & 1 & 2 \end{array}$$

where $\alpha > 0$, so B is both payoff and pairwise risk-dominant.

We are interested in the long-run behavior of our system for small positive $\alpha$ and low $\epsilon$. One may
modify the proof of Theorem 15 and obtain the following theorem [51].

Theorem 16 For every $\gamma > 0$, there exist $\alpha(\gamma)$ and $\epsilon(\gamma)$ such that for every $0 < \alpha < \alpha(\gamma)$, there
exists $\epsilon(\alpha)$ such that for $\epsilon(\alpha) < \epsilon < \epsilon(\gamma)$, $X^C$ is $\gamma$-ensemble stable, and for $0 < \epsilon < \epsilon(\alpha)$, $X^B$ is
$\gamma$-ensemble stable.

Observe that for $\alpha = 0$, both $X^B$ and $X^C$ are stochastically stable (they appear with the frequency
1/2 in the limit of zero noise) but $X^C$ is low-noise ensemble stable. For small $\alpha > 0$, $X^B$ is both
stochastically stable (it appears with the frequency 1 in the limit of zero noise) and low-noise ensemble stable.
However, for an intermediate noise $\epsilon(\alpha) < \epsilon < \epsilon(\gamma)$, if the number of players is large enough, then
in the long run, almost all players use the strategy C ($X^C$ is ensemble stable). If we lower $\epsilon$ below
$\epsilon(\alpha)$, then almost all players start to use the strategy B. $\epsilon = \epsilon(\alpha)$ is the line of the first-order phase
transition. In the thermodynamic limit, there exist two Gibbs states (equilibrium states) on this line.
We may say that at $\epsilon = \epsilon(\alpha)$, the population of players undergoes a sharp equilibrium transition
from C to B-behaviour.

9.4 Stochastic stability in non-potential games 

Let us now consider non-potential games with three strategies and three symmetric Nash equilibria:
$(A, A)$, $(B, B)$, and $(C, C)$. Stationary measures of such games cannot be explicitly constructed.
To find stochastically stable states we will use here the tree representation of stationary measures 
described in Chapter 7. We will discuss some interesting examples. 



Example 8

Players are located on a finite subset of the one-dimensional lattice $Z$ and interact with their nearest
neighbours only. Denote by $n$ the number of players. For simplicity we will assume periodic boundary
conditions, that is we will identify the $(n + 1)$-th player with the first one. In other words, the players
are located on the circle.

The payoffs are given by the following matrix:

$$U = \begin{array}{c|ccc} & A & B & C \\ \hline A & 1 + \alpha & 0 & 1.5 \\ B & 0 & 2 & 0 \\ C & 0 & 0 & 3 \end{array}$$

with $0 < \alpha \leq 0.5$.

As before, we have three homogeneous Nash configurations: $X^A$, $X^B$, and $X^C$. The log-linear and
perturbed best-response dynamics for this game were discussed in [52].

Let us note that $X^A$, $X^B$, and $X^C$ are the only absorbing states of the noise-free dynamics. We
begin with a stochastic dynamics with a state-independent noise. Let us consider first the case of
$\alpha < 0.5$.

Theorem 17 If $0 < \alpha < 0.5$, then $X^C$ is stochastically stable in the perturbed best-response dynamics.

Proof: It is easy to see that $q_m(X^C)$ is of the order $\epsilon^2$, whereas $q_m(X^B)$ and $q_m(X^A)$ are of
higher order in $\epsilon$, with exponents that grow with $n$.

Let us now consider the log-linear rule.

Theorem 18 If $n < 2 + 1/(0.5 - \alpha)$, then $X^B$ is stochastically stable and if $n > 2 + 1/(0.5 - \alpha)$, then
$X^C$ is stochastically stable in the log-linear dynamics.

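Proof: The following are maximal A-tree, B-tree, and C-tree:

$$B \to C \to A, \quad C \to A \to B, \quad A \to B \to C,$$

where the probability of $A \to B$ is equal to

$$\frac{1}{1 + 1 + e^{\frac{1}{\epsilon}(2 + 2\alpha)}} \left( \frac{1}{1 + e^{-\frac{2}{\epsilon}} + e^{\frac{1}{\epsilon}(-1 + \alpha)}} \right)^{n-2} \frac{1}{1 + e^{-\frac{4}{\epsilon}} + e^{-\frac{4}{\epsilon}}}, \tag{67}$$

the probability of $B \to C$ is equal to

$$\frac{1}{1 + 1 + e^{\frac{4}{\epsilon}}} \left( \frac{1}{1 + e^{-\frac{1}{\epsilon}} + e^{-\frac{1.5}{\epsilon}}} \right)^{n-2} \frac{1}{1 + e^{-\frac{3}{\epsilon}} + e^{-\frac{6}{\epsilon}}}, \tag{68}$$

and the probability of $C \to A$ is equal to

$$\frac{1}{1 + e^{-\frac{3}{\epsilon}} + e^{\frac{3}{\epsilon}}} \left( \frac{1}{1 + e^{-\frac{1}{\epsilon}(2.5 + \alpha)} + e^{\frac{1}{\epsilon}(0.5 - \alpha)}} \right)^{n-2} \frac{1}{1 + e^{-\frac{1}{\epsilon}(2 + 2\alpha)} + e^{-\frac{1}{\epsilon}(2 + 2\alpha)}}. \tag{69}$$

Let us observe that

$$P_{B \to C \to A} = O(e^{-\frac{1}{\epsilon}(7 + (0.5 - \alpha)(n - 2))}), \tag{70}$$

$$P_{C \to A \to B} = O(e^{-\frac{1}{\epsilon}(5 + 2\alpha + (0.5 - \alpha)(n - 2))}), \tag{71}$$

$$P_{A \to B \to C} = O(e^{-\frac{1}{\epsilon}(6 + 2\alpha)}), \tag{72}$$

where $\lim_{x \to 0} O(x)/x = 1$.

Now if $n < 2 + 1/(0.5 - \alpha)$, then

$$\lim_{\epsilon \to 0} \frac{P_{A \to B \to C}}{P_{C \to A \to B}} = \lim_{\epsilon \to 0} \frac{P_{B \to C \to A}}{P_{C \to A \to B}} = 0, \tag{73}$$

which finishes the proof.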

It follows that for a small enough $n$, $X^B$ is stochastically stable and for a large enough $n$, $X^C$ is
stochastically stable. We see that adding two players to the population may change the stochastic
stability of Nash configurations. Let us also notice that the strategy C is globally risk dominant.
Nevertheless, it is not stochastically stable in the log-linear dynamics for a sufficiently small number
of players.
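
The dependence of stochastic stability on the number of players can be made concrete by comparing the exponents of the three maximal trees in (70)-(72). The Python sketch below takes these exponents as given, with `alpha` standing for the payoff parameter of Example 8; the function names are introduced here for illustration, and the state with the smallest exponent is the one selected in the zero-noise limit.

```python
def tree_exponents(n: int, alpha: float) -> dict:
    """Exponents c in P = O(exp(-c / eps)) of the maximal A-, B- and C-trees."""
    a_to_b = 2 + 2 * alpha
    b_to_c = 4.0
    c_to_a = 3 + (0.5 - alpha) * (n - 2)
    return {"A": b_to_c + c_to_a,    # B -> C -> A
            "B": c_to_a + a_to_b,    # C -> A -> B
            "C": a_to_b + b_to_c}    # A -> B -> C


def stochastically_stable(n: int, alpha: float) -> list:
    """States whose maximal tree has the smallest exponent (ties are listed)."""
    exponents = tree_exponents(n, alpha)
    least = min(exponents.values())
    return sorted(s for s, c in exponents.items() if abs(c - least) < 1e-12)
```

For `alpha = 0.25` the threshold $2 + 1/(0.5 - \alpha)$ equals 6, so the selected state changes from B to C when the number of players grows from 5 to 7; exactly at the threshold the B-tree and the C-tree have equal exponents.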

Let us now discuss the case of $\alpha = 0.5$ [52].

Theorem 19 If $\alpha = 0.5$, then $X^B$ is stochastically stable for any $n$ in the log-linear dynamics.

Proof:

$$\lim_{\epsilon \to 0} \frac{P_{A \to B \to C}}{P_{C \to A \to B}} = \lim_{\epsilon \to 0} \frac{P_{B \to C \to A}}{P_{C \to A \to B}} = 0.$$

$X^B$ is stochastically stable which means that for any fixed number of players, if the noise is sufficiently
small, then in the long run we observe B players with an arbitrarily high frequency. However, we
conjecture that for any low but fixed noise, if the number of players is big enough, the stationary
measure is concentrated on the $X^C$-ensemble. We expect that $X^C$ is ensemble stable because its
lowest-cost excitations occur with a probability of the order $e^{-3/\epsilon}$ and those from $X^B$ with a probability
of the order $e^{-4/\epsilon}$. We observe this phenomenon in Monte-Carlo simulations.

Example 9

Players are located on a finite subset $\Lambda$ of $Z$ (with periodic boundary conditions) and interact with
their two nearest neighbours. They have at their disposal three pure strategies: A, B, and C. The
payoffs are given by the following matrix [51]:

$$U = \begin{array}{c|ccc} & A & B & C \\ \hline A & 3 & 0 & 2 \\ B & 2 & 2 & 0 \\ C & 0 & 0 & 3 \end{array}$$

Our game has three Nash equilibria: $(A, A)$, $(B, B)$, and $(C, C)$. Let us note that in pairwise
comparisons, B risk dominates A, C dominates B and A dominates C. The corresponding spatial game has
three homogeneous Nash configurations: $X^A$, $X^B$, and $X^C$. They are the only absorbing states of the
noise-free best-response dynamics.

Theorem 20 $X^B$ is stochastically stable.

Proof: The following are maximal A-tree, B-tree, and C-tree:

$$B \to C \to A, \quad C \to A \to B, \quad A \to B \to C.$$

Let us observe that

$$P_{B \to C \to A} = O(e^{-\frac{6}{\epsilon}}), \tag{74}$$

$$P_{C \to A \to B} = O(e^{-\frac{4}{\epsilon}}), \tag{75}$$

$$P_{A \to B \to C} = O(e^{-\frac{6}{\epsilon}}). \tag{76}$$

The theorem follows from the tree characterization of stationary measures.

$X^B$ is stochastically stable because it is much more probable (for low $\epsilon$) to escape from $X^A$ and $X^C$
than from $X^B$. The relative payoffs of Nash configurations are not relevant here (in fact $X^B$ has the
smallest payoff). Let us recall Example 7 of a potential game, where an ensemble-stable configuration
has more lowest-cost excitations. It is easier to escape from an ensemble-stable configuration than
from other Nash configurations.
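
The escape costs in Example 9 can be estimated with a short computation. The Python sketch below is an illustration under several assumptions: the dynamics is the log-linear one, the payoff matrix is that of Example 9 with zeros for $U(A, B)$, $U(B, C)$, $U(C, A)$ and $U(C, B)$, strategies A, B, C are encoded as 0, 1, 2, and the cost is computed only for one particular path, in which a single player switches first and the new block then grows by one neighbour at a time. It is not a minimization over all paths.

```python
# Assumed payoff matrix of Example 9.
U9 = [[3, 0, 2],
      [2, 2, 0],
      [0, 0, 3]]


def path_cost(U, s: int, t: int, n: int) -> float:
    """Exponent c in P = O(exp(-c / eps)) for the path from X^s to X^t on a ring
    of n >= 3 players: one player switches to t, then the t-block grows."""
    def switch_cost(left: int, right: int) -> float:
        # Payoff lost by choosing t instead of a best response to the neighbours.
        pay = [U[y][left] + U[y][right] for y in range(len(U))]
        return max(pay) - pay[t]

    first = switch_cost(s, s)        # both neighbours still play s
    growth = switch_cost(t, s)       # one neighbour already plays t
    last = switch_cost(t, t)         # both neighbours play t
    return first + (n - 2) * growth + last


def tree_costs(U, n: int) -> dict:
    """Costs of the trees B -> C -> A, C -> A -> B and A -> B -> C."""
    a_b, b_c, c_a = path_cost(U, 0, 1, n), path_cost(U, 1, 2, n), path_cost(U, 2, 0, n)
    return {"A": b_c + c_a, "B": c_a + a_b, "C": a_b + b_c}
```

With these payoffs the growth steps are free, so the costs do not depend on the number of players: leaving $X^A$ or $X^C$ costs 2 and leaving $X^B$ costs 4, which gives the B-tree the smallest total cost.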

Stochastic stability concerns single configurations in the zero-noise limit; ensemble stability concerns
families of configurations in the limit of the infinite number of players. It is very important to
investigate and compare these two concepts of stability in non-potential games.

Non-potential spatial games cannot be directly presented as systems of interacting particles. They
constitute a large family of interacting objects not thoroughly studied so far by methods of statistical
physics. Some partial results concerning stochastic stability of Nash equilibria in non-potential spatial
games were obtained in [17, 18, 5, 53, 52].

One may wish to say that A risk dominates the other two strategies if it risk dominates them in 
pairwise comparisons. In Example 9, B dominates A, C dominates B, and finally A dominates C. But
even if we do not have such a cyclic relation of dominance, a strategy which is pairwise risk-dominant 
may not be stochastically stable as in the case of Example 8. A more relevant notion seems to be that 
of a global risk dominance [45]. We say that A is globally risk dominant if it is a best response to every
mixed strategy which assigns probability at least 1/2 to A. It was shown in [17, 18] that a global risk-dominant
strategy is stochastically stable in some spatial games with local interactions. 

A different criterion for stochastic stability was developed by Blume [5]. He showed (using techniques
of statistical mechanics) that in a game with $m$ strategies $A_i$ and $m$ symmetric Nash equilibria
$(A_k, A_k)$, $k = 1, \ldots, m$, $A_1$ is stochastically stable if

$$\min_{k > 1}(U(A_1, A_1) - U(A_k, A_k)) > \max_{k > 1}(U(A_k, A_k) - U(A_1, A_k)). \tag{77}$$

We may observe that if $A_1$ satisfies the above condition, then it is pairwise risk dominant.






9.5 Dominated strategies 

We say that a pure strategy is strictly dominated by another (pure or mixed) strategy if it gives a 
player a lower payoff than the other one regardless of strategies chosen by his opponents. 

Definition 9 $k \in S$ is strictly dominated by $y \in \Delta$ if $U_i(k, w_{-i}) < U_i(y, w_{-i})$ for every strategy profile $w$.

Let us see that a strategy can be strictly dominated by a mixed strategy without being strictly 
dominated by any pure strategy in its support. 

Example 10 

         A    B    C 
    A    5    1    3 
    B    2    2    2 
    C    1    5    3 



B is strictly dominated by the mixed strategy assigning probability 1/2 to each of A and C, but it is 
strictly dominated neither by A nor by C. 
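
The claim of Example 10 can be verified directly. The sketch below assumes that the entries of the matrix are the payoffs of the row strategy against the column strategy; the helper names are ours. Because payoffs are linear in the opponent's mixed strategy, it suffices to compare payoffs against each pure strategy of the opponent.

```python
from fractions import Fraction

# Payoff matrix of Example 10: U[r][c] is the payoff of row strategy r
# against column strategy c (rows and columns ordered A, B, C).
U = [[5, 1, 3],
     [2, 2, 2],
     [1, 5, 3]]
A, B, C = 0, 1, 2

def mixed_payoffs(U, y):
    """Payoff of the mixed strategy y against each pure strategy of the opponent."""
    return [sum(y[r] * U[r][c] for r in range(len(U))) for c in range(len(U[0]))]

def strictly_dominates(U, y, k):
    """True if y yields a strictly higher payoff than pure k against every pure opponent."""
    return all(p > U[k][c] for c, p in enumerate(mixed_payoffs(U, y)))

half = Fraction(1, 2)
mix_AC = [half, 0, half]
pure_A = [1, 0, 0]
pure_C = [0, 0, 1]

print(mixed_payoffs(U, mix_AC))           # payoffs of the 1/2-1/2 mixture of A and C
print(strictly_dominates(U, mix_AC, B))   # mixture versus B
print(strictly_dominates(U, pure_A, B))   # A alone versus B
print(strictly_dominates(U, pure_C, B))   # C alone versus B
```

The mixture earns 3 against every opponent strategy, whereas B always earns 2; A alone earns only 1 against B, and C alone earns only 1 against A.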

It is easy to see that strictly dominated pure strategies cannot be present in the support of any 
Nash equilibrium. 

In the replicator dynamics (16), all strictly dominated pure strategies are wiped out in the long 
run if all strategies are initially present [1, 78]. 

Theorem 21 If a pure strategy k is strictly dominated, 
then $\xi_k(t, x^0) \to 0$ as $t \to \infty$ for any $x^0 \in \mathrm{interior}(\Delta)$. 
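
Theorem 21 can be illustrated numerically on the game of Example 10. The following sketch assumes that the replicator dynamics (16) has the standard form dx_k/dt = x_k [(Ux)_k - x·Ux] and integrates it with a plain explicit Euler scheme; the step size, the time horizon, and the function names are our choices, and the run is an illustration, not a proof.

```python
def replicator_step(x, U, dt):
    """One explicit Euler step of dx_k/dt = x_k * ((Ux)_k - x.Ux)."""
    n = len(x)
    fitness = [sum(U[k][j] * x[j] for j in range(n)) for k in range(n)]
    mean = sum(x[k] * fitness[k] for k in range(n))
    return [x[k] + dt * x[k] * (fitness[k] - mean) for k in range(n)]

def simulate(x0, U, t_end=20.0, dt=0.01):
    """Integrate the replicator equation from x0 up to time t_end."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        x = replicator_step(x, U, dt)
    return x

# Example 10: strategy B (index 1) is strictly dominated by a mixture of A and C.
U = [[5, 1, 3],
     [2, 2, 2],
     [1, 5, 3]]
x_final = simulate([1 / 3, 1 / 3, 1 / 3], U)
print(x_final)  # the second component (frequency of B) is close to zero
```

Starting from the uniform interior point, the frequency of the dominated strategy B decays towards zero, in agreement with the theorem.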

Strictly dominated strategies should not be used by rational players, and consequently we might 
think that their presence should not have any impact on the long-run behaviour of the population. 
We will show that in the best-reply dynamics, if we allow players to make mistakes, this is not 
necessarily true. Let us consider the following game with a strictly dominated strategy and two 
symmetric Nash equilibria [51]. 

Example 11 

          A       B       C 
    A     0      0.1      1 
    B    0.1    2 + a    1.1 
    C    1.1     1.1      2 



where a > 0. 

We see that strategy A is strictly dominated by both B and C, hence $X^A$ is not a Nash configuration. 
$X^B$ and $X^C$ are both Nash configurations, but only $X^B$ is a ground-state configuration for $-U$. In the 
absence of A, B is both payoff and risk dominant and is therefore stochastically stable and low-noise 
ensemble stable. Adding the strategy A does not change the dominance relations; B is still payoff and 
pairwise risk dominant. However, Example 11 fulfills all the assumptions of Theorem 16, and we get 
that $X^C$ is $\gamma$-ensemble stable at intermediate noise levels. The mere presence of a strictly dominated 
strategy A changes the long-run behaviour of the population. 
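
The static properties of Example 11 used in this argument are easy to confirm. The sketch below takes the (A, A) entry of the payoff matrix to be 0, which is an assumption (strict domination of A by B only requires it to be below 0.1), and uses a = 0.5 as an arbitrary positive value; the helper names are ours. The stability statements themselves depend on Theorem 16 and are not checked here.

```python
def example_11(a):
    """Payoff matrix of Example 11 (rows/columns A, B, C); the (A, A) entry 0 is assumed."""
    return [[0.0, 0.1, 1.0],
            [0.1, 2.0 + a, 1.1],
            [1.1, 1.1, 2.0]]

def strictly_dominated_by_pure(U, k, j):
    """True if pure strategy j gives a strictly higher payoff than k against every opponent."""
    return all(U[j][c] > U[k][c] for c in range(len(U)))

def symmetric_pure_nash(U):
    """Indices s such that (s, s) is a Nash equilibrium, i.e. s is a best response to itself."""
    n = len(U)
    return [s for s in range(n) if all(U[s][s] >= U[r][s] for r in range(n))]

def risk_dominates(U, s, t):
    """Pairwise risk dominance of s over t in the 2x2 game restricted to {s, t}."""
    return U[s][s] - U[t][s] > U[t][t] - U[s][t]

A, B, C = 0, 1, 2
U = example_11(a=0.5)
print(strictly_dominated_by_pure(U, A, B), strictly_dominated_by_pure(U, A, C))
print(symmetric_pure_nash(U))    # indices of B and C
print(U[B][B] > U[C][C])         # B is payoff dominant
print(risk_dominates(U, B, C))   # B is pairwise risk dominant over C
```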

Similar results were discussed by Myatt and Wallace [58]. In their games, at every discrete moment 
of time, one of the players leaves the population and is replaced by another one who plays the best 
response. The new player calculates his best response with respect to his own payoff matrix, which is 
the matrix of a common average payoff modified by a realization of some random variable with zero 
mean. The noise does not appear in the game as a result of players' mistakes but is the effect of their 
idiosyncratic preferences. The authors then show that the presence of a strictly dominated strategy 
may change the stochastic stability of Nash equilibria. However, the reason for such behaviour 
differs between their model and ours. In our model, it is relatively easy to get out of $X^B$, and this 
makes $X^C$ ensemble stable. Myatt and Wallace introduce a strictly dominated strategy in such a 
way that it is relatively easy to make a transition to it from a risk- and payoff-dominant equilibrium, 
and then with a high probability the population moves to a second Nash configuration, which results 
in its stochastic stability. 

This is exactly the mechanism present in Examples 8 and 9. 

10 Review of other results 

We discussed the long-run behaviour of populations of interacting individuals playing games. We have 
considered deterministic replicator dynamics and stochastic dynamics of finite populations. 

In spatial games, individuals are located on vertices of certain graphs and they interact only with 
their neighbours. 

In this paper, we considered only simple graphs: finite subsets of the regular $\mathbf{Z}$ or $\mathbf{Z}^2$ lattice. 
Recently, many interesting results have appeared on evolutionary dynamics on random graphs, Barabási-Albert 
scale-free graphs, and small-world networks [88, 91, 101, 102, 90, 80, 81, 82, 3]. The 
Prisoner's Dilemma in particular was studied on such graphs, and it was shown that their heterogeneity favors 
cooperation in the population [80, 81, 82, 90]. 

In well-mixed populations, individuals are randomly matched to play a game. The deterministic 
selection part of the dynamics ensures that if the mean payoff of a given strategy is bigger than the 
mean payoff of the other one, then the number of individuals playing the given strategy increases. 
At discrete moments of time, individuals produce offspring in numbers proportional to their payoffs. The total 
number of individuals is then scaled back to the previous value so that the population size is constant. 
Individuals may mutate, so the population may move against the selection pressure. This is an example 
of a stochastic frequency-dependent Wright-Fisher process [20, 21, 108, 12, 19]. 
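
One generation of such a process can be sketched as follows for a game with two strategies. This is a minimal illustration under several assumptions that are ours: payoffs are positive, each individual's payoff is computed against the current population frequencies (including itself), every offspring independently switches strategy with probability mu, and the function name and parameters are hypothetical.

```python
import random

def wright_fisher_step(n_A, N, U, mu, rng):
    """One generation of a two-strategy frequency-dependent Wright-Fisher process.

    n_A: current number of A-players; N: constant population size;
    U: 2x2 payoff matrix with positive entries; mu: mutation probability;
    rng: an instance of random.Random.
    """
    x = n_A / N
    payoff_A = U[0][0] * x + U[0][1] * (1 - x)
    payoff_B = U[1][0] * x + U[1][1] * (1 - x)
    total = n_A * payoff_A + (N - n_A) * payoff_B
    p_A = n_A * payoff_A / total  # expected share of A among the offspring
    new_A = 0
    for _ in range(N):
        # Sampling exactly N offspring rescales the population back to size N.
        is_A = rng.random() < p_A
        if rng.random() < mu:
            is_A = not is_A  # mutation may act against the selection pressure
        new_A += is_A
    return new_A

rng = random.Random(0)
U2 = [[3.0, 1.0],
      [1.0, 2.0]]  # hypothetical coordination game
n = 50
trajectory = [n]
for _ in range(200):
    n = wright_fisher_step(n, 100, U2, 0.01, rng)
    trajectory.append(n)
```

Without mutations (mu = 0) the states with 0 and N players of strategy A are absorbing; a positive mu makes every state reachable from every other.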

There are also other stochastic dynamics of finite populations. The most important one is the 
Moran process [50, 12, 19]. In this dynamics, at every time step a single individual is chosen for 
reproduction with probability proportional to his payoff, and then his offspring replaces a randomly 
chosen individual. It was shown recently that in the limit of an infinite population, the Moran 
process results in the replicator dynamics [95, 96]. 

The stochastic dynamics of finite populations has been extensively studied recently [65, 93, 31, 
38, 43, 69, 70, 71, 98, 99]. The notion of an evolutionarily stable strategy for finite populations 
was introduced in [65, 93, 61, 104, 15, 97]. One important quantity to calculate is the fixation 
probability of a given strategy. It is defined as the probability that a strategy introduced into a 
population by a single player will take over the whole population. Recently, Nowak et al. [65] 
formulated the following weak-selection 1/3 law. In two-player games with two strategies, selection 
favors the strategy A replacing B if the fraction of A-players in the population for which the average 
payoff of the strategy A is equal to the average payoff of the strategy B is smaller than 1/3, i.e. 
the mixed Nash equilibrium for this game is smaller than 1/3. The 1/3 law was proven to hold both 
in the Moran [65, 93] and the Wright-Fisher process [38]. 
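
The 1/3 law can be explored numerically with the standard closed-form expression for the fixation probability in a birth-death chain. The sketch below makes assumptions that are not stated in these notes: the fitness of a strategy is 1 - w + w·(average payoff) with a small selection intensity w, players do not interact with themselves, and "selection favors A" means that the fixation probability exceeds its neutral value 1/N. The two payoff matrices are hypothetical coordination games with mixed equilibria x* = (d - b)/(a - b - c + d) equal to 1/5 and 1/2.

```python
def fixation_probability(a, b, c, d, N, w):
    """Fixation probability of a single A-player in a frequency-dependent Moran process.

    Payoffs: A vs A = a, A vs B = b, B vs A = c, B vs B = d.
    Fitness is assumed to be 1 - w + w * (average payoff).
    """
    total = 1.0    # accumulates 1 + sum over k of prod_{i<=k} g_i / f_i
    product = 1.0
    for i in range(1, N):
        # i is the current number of A-players; self-interaction is excluded.
        payoff_A = (a * (i - 1) + b * (N - i)) / (N - 1)
        payoff_B = (c * i + d * (N - i - 1)) / (N - 1)
        f = 1 - w + w * payoff_A
        g = 1 - w + w * payoff_B
        product *= g / f
        total += product
    return 1.0 / total

N, w = 100, 0.001
rho_neutral = fixation_probability(5, 1, 1, 2, N, 0.0)  # no selection
rho_low = fixation_probability(5, 1, 1, 2, N, w)        # x* = 1/5 < 1/3
rho_high = fixation_probability(2, 1, 1, 2, N, w)       # x* = 1/2 > 1/3
print(rho_neutral, rho_low, rho_high)
```

In the first game the fixation probability of A comes out above 1/N, in the second below it, which is the behaviour the 1/3 law predicts for weak selection and a large population.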

In this review we discussed only two-player games. Multi-player games were studied recently in 
[42, 10, 11, 53, 28, 74, 39, 75]. 

We have not discussed population genetics in the context of game theory at all. We refer to [36, 12] 
for results and references. 